Mar 16, 2021
It was an honor to sit down with two of the creators of Power
Query, Sid Jayadevan and Miguel Llopis. You get to hear the
history of Power Query from their perspective! These guys are so
busy it took quite a bit of coordination to get them together to
record this episode. We hope you enjoy it as much as we
did!
References In This Episode:
Reese's Peanut Butter Cups Old School Commercial
Episode Transcript:
Rob Collie (00:00:00):
Hello Friends, this week's guests are Miguel and Sid, of the data
integration team at Microsoft. They were both around and involved
when Power Query essentially emerged from the primordial ooze at
Microsoft. And that's what this week's episode is mostly about,
which is, what is the Power Query origin story? I've long been
fascinated by that question as an outsider and observer, since this
all happened after I left Microsoft. It's kind of hard to imagine a
time when we didn't have Power Query. But there were actually three
or four years in there, when we were building DAX data models for
clients without the benefit of Power Query. And this is where I use my
old man voice. Kids these days, they have no idea how easy they
have it, back in our day, we didn't have fancy Power Query, you
just had to cobble the data together by hand. And so I'm not just
fascinated by the origin story here, I'm actually deeply impressed
and appreciative, especially as a former software engineer, who
knows how challenging it is, just how gorgeous Power Query really
is.
Rob Collie (00:01:06):
And because of that, along those lines, during this conversation, I
kept trying to get these two gentlemen to take the victory lap.
They didn't take that bait, too humble, too cognizant of the work
yet to be done, which of course, is really how we would want it,
isn't it? Because seriously, it's good to know that there are
people like this at Microsoft, who are stewards, really, of our
futures. They're the ones building, not just the tools that we
already use, but the tools that we're going to use in the future.
There are a few places in here where we geek out a little bit as
computer scientists, but mostly, the conversation stayed very
firmly rooted in the human beings, the human element again, which
is what it's all about. I'm sincerely honored that they took two
hours out of their busy schedules to spend speaking with me and
speaking to you on this podcast. I hope you enjoy it. I hope you
learn something from it. So let's get into it.
Announcer (00:02:03):
Ladies and gentlemen, may I have your attention, please.
Announcer (00:02:07):
This is the Raw Data by P3 Podcast, with your host, Rob Collie.
Find out what the experts at P3 can do for your business, go to
powerpivotpro.com. Raw Data by P3 is data with the human
element.
Rob Collie (00:02:23):
Welcome to the show, Sid and Miguel, how are you today fine
gentlemen?
Sid Jayadevan (00:02:27):
Very well, thank you. Many thanks for having us.
Miguel Llopis (00:02:30):
Yeah, doing great, Rob. Thank you so much for having us, it's a
pleasure.
Rob Collie (00:02:33):
Seriously, to get a hold of the two of you, and coordinate
calendars and make this happen, it's an honor for us. So this was
something that almost from the moment we launched the podcast,
asking members of our team at P3, what would be interesting topics.
One of them that just sort of keeps insistently coming up is where
did Power Query come from? And so I was on this mission to hunt
down some of the people who could speak to that. And there's a lot
of things we can talk about. What are your roles today at
Microsoft?
Miguel Llopis (00:03:06):
Sid, do you want to go first?
Sid Jayadevan (00:03:07):
Yep. So I am an engineering manager at Microsoft. I manage the team
that works on data integration, with Power Query being one of the
key elements, but we have a variety of other things we do with
connectors, gateways, dataflows, all of which are very symbiotic.
Power Query depends on all of those things, and all of those things
depend on Power Query. So there's a larger space that we operate
within, but Power Query has been around the longest of all of those
things.
Rob Collie (00:03:39):
So I didn't know this going in. But I have some complaints about
this QuickBooks connector. I'm sure you know what those complaints
are too. They will not be ones that you're hearing for the first
time from me. Making a note of that.
Sid Jayadevan (00:03:55):
No points for guessing.
Rob Collie (00:03:56):
No points. Okay. Miguel, what are your responsibilities today?
Miguel Llopis (00:03:59):
I'm the program manager lead for Power Query, and connectors and
dataflows. So a bunch of technologies and experiences that Sid was
talking about. Been on the team for quite a while, maybe about
the same as Sid, maybe a bit less. He's older than me, so that
would show through this interview.
Rob Collie (00:04:17):
Oh, yeah. Well, we're only recording the audio here, you can't
really tell.
Miguel Llopis (00:04:22):
I meant from his wisdom, not from the look.
Rob Collie (00:04:24):
Oh, I see.
Sid Jayadevan (00:04:26):
That's Miguel's way of saying I'm Palaeolithic.
Miguel Llopis (00:04:30):
There you go.
Rob Collie (00:04:30):
Oh, yeah. I see. You're also free-range, gluten-free?
Miguel Llopis (00:04:35):
Of course.
Rob Collie (00:04:39):
So you both go back a ways on Power Query specifically. So from my
perspective, from outside the company, I left Redmond in 2009. And
then I sort of formally left Microsoft in February of 2010. So I've
been formally gone for 11 years, and sort of informally gone for a
little longer than that. It's really kind of hard for me to even
get back into the before mindset, when we had DAX, and data
modeling, and we had the VertiPaq Engine, they're all still so
incredibly central to Power BI today. And we had them in the Power
Pivot form, we didn't even have SSAS Tabular yet.
Rob Collie (00:05:16):
But we had nothing in this giant void, and then Power Query came along,
to me, basically, out of the blue. I had no advance notice of this.
My sources in Redmond, my spies, they hadn't hooked me up with the
information that something amazing was going to come along and
complement the tools. There are so many times, guys, seriously so
many times where I or we would be working with a client and we
would know what the ideal data model would look like. But they
couldn't do it because the data that they were getting wasn't in
the right format to build the right data model. Even something as
simple as you need a lookup table or dimension table, and no one's
giving it to you.
Rob Collie (00:05:58):
So it was always like, "Oh, now you need to go find a DBA," that
you're either lucky or you're not, you either had one or you
didn't. And if you were unlucky, you were just out of luck, there
was no recourse. You were just going to have to make these changes
to these tables manually. There was no automatic refresh anymore.
It was really a tremendous limitation. And then suddenly, and at
the time when it first arrived, what was it? The first name? Was it
Data Explorer?
Miguel Llopis (00:06:27):
Yes, Data Explorer.
Rob Collie (00:06:30):
Even that initial name sort of conveyed a different mission than
what it sort of morphed into. I don't even really remember. It was
also a way to connect to lots of different things, wasn't it? As
Power Query is today. Were you around when Data Explorer, when that
name was chosen? What was behind that?
Miguel Llopis (00:06:46):
Yeah, that would be an interesting story. So actually, just going
one step before that, and I think we're getting towards 2012 at
this point, I think. The very first incarnation of what is today
Power Query in market was something called SQL Azure Lab for data
exploration. It didn't even have an actual product name, we're
going to call it that way. SQL Azure Labs was a set of
initiatives across the SQL and Azure teams back in the day to
actually spike different sets of technologies that would help in
different segments, like data exploration would be one, data
visualization. There were a bunch of things that came out that
way.
Miguel Llopis (00:07:22):
It was actually a fully cloud-based, Power Query-like experience to
actually connect to data, transform data, and then output data in
different ways. There were ways to actually get your data out as an
OData endpoint that you could use maybe to create an app, maybe
to consume from Power Pivot, to your point, Rob. The feedback back
then, and again 2012, was, hey, Pivot is working in Excel, we want
these experiences in Excel. So we repivoted all of that, no pun
intended with repivoting, towards a client-based experience that was
actually an Excel add-in that we initially released for Excel 2010
and 2013.
Miguel Llopis (00:07:59):
There was quite a bit of naming related discussions, but I think
the data exploration aspect, everyone got used to it. So we ended
up coming up with Microsoft codename Data Explorer for Excel. That
was the very first name for the Excel add-in, which, a few months
later, as we went into GA, actually got renamed
to Power Query. And really the alignment there was with the power
family, the power tools, Power Pivot, Power View, Power Maps back
in the day as well. And then Power Query as a way to actually bring
data in. Maybe Sid remembers some of these discussions more than
me. I know there was also an alternative option, which was Power
Import, but really, we went for the term query because it really
reinforced the notion of repeatability and refreshability of those
queries, no pun intended.
Sid Jayadevan (00:08:45):
I think we tested many, many terms, and import was just one of
them. But we felt like the essence of the product was that ability
to query ad hoc and at will, and so we really wanted to focus on
that. And so that's how we ended up with the query piece of it.
Rob Collie (00:09:03):
I think Power Query was a great name. Honestly, it didn't really
land for me what y'all were giving us when it was still called Data
Explorer. That name was actually a very large cognitive obstacle
for me, explorer, it sounds like an analysis tool. Now, knowing
where the roots were, you had a name, there's a more appropriate
name for what you were doing, probably when it was still that Azure
Lab thing. But it's so funny, this happens all the time, when your
mission pivots, certain parts of it still kind of leak through,
like the old name, by default.
Rob Collie (00:09:36):
Just like the old story, whether it's true or not, about the
railroad tracks being the width that they are because the Roman
chariot was that wide. It's just like, what we did yesterday is
because we did it that way the day before and whatever. Certain
things just have a momentum that carries forward. When the name
changed to Power Query, suddenly, I was like, "Oh, okay, this is
awesome." But when it was Data Explorer, it was just a really cool
novelty. I didn't have a sense of its purpose, or I didn't feel
like it was a serious tool yet. It's really kind of interesting the
power of naming, isn't it?
Sid Jayadevan (00:10:08):
Neither did we, to a large extent, we were trying to find that
identity. And I think as we got deeper into it, the query first
aspect became a lot clearer.
Rob Collie (00:10:20):
I also think the Power Query versus Power Import, I think the right
decision was made there. The repeatability, you know this, but I'm
going to say it anyway, if I only get five minutes with an Excel
person who's never been exposed to the power platform, I have five
minutes and I have to drop their jaw, I'm going to show them Power
Query. I'm not going to show them the data model, I'm not going to
show them DAX. Now, I think that ultimately, you absolutely need to be using
both. But Power Query is such an amazing life changer for the Excel
crowd and they can immediately appreciate what it's going to do for
them.
Rob Collie (00:10:57):
It's harder for them to appreciate what the data model is going to
do for them. Which is why, if I've got five minutes, I don't go
subtle. Isn't it amazing? You're talking about figuring out your
own mission over time, that was something amazing. It sounds like
from what you're saying that the M language, and the engine that
goes with it, and all the stuff that's really difficult to build
and to design, a lot of that was already kind of done before the
Excel crowd became a focus. Is that true?
Sid Jayadevan (00:11:28):
Yeah, that's true. Power Query is essentially a visual interface on
top of the M language. The M language is absolutely the essence of
the product, it's the foundation. And that foundation was built
before the product as was often necessary, you need the foundation
in place. And there is a long history around M that predates 2012,
by let's just say, several years. And we won't delve into all the
details, but much like we had to get clear about what Power Query
was for and who we were targeting, there was a similar process with
M. M was from a technology point of view, this very simple, yet
powerful thing, in that it was functional, it composed, in our
opinion, reasonably well. And so you could do lots of different
things with it. But we wanted to put it in the hands of lots and
lots of people who didn't necessarily have a programming
background, or even necessarily a query background.
Sid Jayadevan (00:12:29):
And so that was the goal that we set for ourselves to bring all of
those people on board, to make things possible for them that
perhaps were a little harder in the past. And so M was really the
foundation, and it was well in place circa 2012. We made some
changes to make it more friendly to the visual experience, to make
it just a little more designer tool friendly, if you will. But the
core of the language was already in place. And on the language
front, we had tried lots of different things. And many, many people
at Microsoft were involved in that effort at many stages.
Rob Collie (00:13:08):
Something you said there really struck me and I wouldn't have
thought about it this way until I heard the history. You made some
changes, just some almost cosmetic changes to the language to make
it more friendly to the visual composer aspect of Power Query. As
soon as you said that, I'm like, "Oh, my god, yeah, this tool is
super, super, super friendly to being edited and written from a
graphical tool." A lot of times, you can go back into the M code
and hand edit it and the visual editor is still completely okay. It
totally understands what you did. That sort of round trip of hand
editing and visual composer, exposing both to the user and still
having a language and a tool that survives that duality, that's a
challenge. That's a really big challenge. I've tried it multiple
times at Microsoft. And I think I'm 0 for a lifetime on having
designed a system that worked like that. So I can certainly
appreciate it from a software engineering perspective, even just
that one little detail of a language that was already pre built.
That's kind of amazing.
Sid Jayadevan (00:14:08):
We're still working on it, it's very much a work in progress. And
there are aspects that are a little more, what should I say,
language oriented, that remain in M, that remain extremely powerful,
that the visual interface doesn't leverage quite as much.
Rob Collie (00:14:25):
I've been talking to Miguel about this. And it's not just a deep
power, but I think the Power Query transform, there should be many
more tabs, because the language is so flexible. And I know that not
everything can be turned into visual, some things are just
absolutely going to forever remain 100% in the realm of I have to
hand edit the M. I'm really nothing but a fan here.
Rob Collie (00:14:45):
I told Miguel on a previous call that back in the early 2000s, when
I was on the Excel team, and I caught the XML bug. The first
feature set that I was a lead program manager for was the XML
import and export capabilities. Not the XML file format, but data
payloads, invoices or whatever, be able to move those in and out of
Excel. And I spent two, four years of my life chasing a dream that
I called Data Merge. There were huge, elaborate graphical mock ups of
all of this, it was really ambitious. It's exactly the kind of
project you'd expect a young software engineer to get all amped up
about and geeked out about.
Rob Collie (00:15:27):
The thing I thought we needed to do was build some sort of
repeatable data transformation logic into Excel. And I tried so
hard to get budget, to get approval, to get greenlit to build a
team to do this. And I got shot down four times a year for multiple
years, every quarter, I make another run at it, like, "Come on,
let's do this." Now that I've seen what you built, I am so glad
that I never got approval to dive into that. Because once I'd seen
what it looks like, a complete solution to it, I realized, oh my gosh,
we were so overmatched. We would have never, ever succeeded, never
come close. The fact that you had to go build a language first,
makes sense to me now in hindsight, but it's really chilling. Oh my
gosh, imagine they let me do my passion project. Thank you, Richard
McAniff for never believing in me.
Miguel Llopis (00:16:24):
Well, I'm sure you-
Sid Jayadevan (00:16:26):
Yeah, not so sure about that, Rob. You might have done a lot
better.
Rob Collie (00:16:30):
I doubt it.
Sid Jayadevan (00:16:31):
[crosstalk 00:16:31] Well, we'll never know.
Rob Collie (00:16:33):
Well, I do, and I wouldn't have.
Miguel Llopis (00:16:36):
Well, Rob, I think you're being too hard on yourself. We don't have
answers to everything. I assume you would have just like us, just
fail fast, learn fast, iterate, learn from customers, learn from
you, Sid, and just refine and get better over time. Here we are 10
years later, or eight years later, and we still have a lot of
things to improve on.
Rob Collie (00:16:56):
It's not even really just about me, it's also that Office doesn't
have the right culture, to do something like what you've done. We
would have gone and tried to solve a handful of simple cases,
that's what would've happened under schedule pressure. And we
would have gotten committed to a system that wasn't elegant at its
core. And then we never would have been able to really scale it to
address... Because you know how it is, if you address 99% of
problems that people have, it doesn't matter, that 1% is still
going to plague enough of their workflows, that is the difference
between they can adopt your tool or not. You've really got to be
complete. We would have never been complete enough. And I can say
that with confidence, knowing myself at the time, and also knowing
the culture that was around me. We would have never gone and done
the right thing, we would have hacked it, and we would have paid
the price. And it's not just a question of me not being up to the
challenge, just organizationally, we weren't at the right
place.
Rob Collie (00:17:48):
Office has really leaned into Power Query, it's a core part of the
Excel ribbon now, basically taken over the prime real estate. So I
think they're absolutely in on Power Query, and they're absolutely
in on the value that it brings. It's just that they're not the
right place to have invented it. In the same way that Office wasn't
the right place to invent DAX, it's just not what Office does,
Office does other things.
Rob Collie (00:18:12):
If I wanted to turn this around and say, the historical struggle
of the data side of the house has been that they haven't been
traditionally as good at user experiences as Office was. But that
gap is really closing, that has become an engineering discipline on
your side of the house that it really wasn't when I was there. I
used to describe Microsoft as there were user teams and engine
teams, and there was no such thing as a team that was both. Office
was the user team and the data platform, they built engines. But
the engine team couldn't build user experience and the user
experience team couldn't build engines. And so I think that's
changed a lot. And this is a great example of it.
Sid Jayadevan (00:18:48):
I mean to the point about Office and Excel, one of the things that
has been a little different with Power Query is that we've embraced
the open source model, perhaps a little bit more than for other
products like that. We have the Office team contributing very
heavily in our code base, not all aspects. It's the Power Query
team that drives the majority of changes, but the Excel team is
very, very involved. In fact, if you look at a lot of the
developments around Excel on the Mac, the Office team has
contributed very heavily to that. And so that ability to have other
teams come in and make changes and they've really been a poster
child for this on the Excel side, that has helped build Power Query
into more of an ecosystem even within Microsoft.
Rob Collie (00:19:38):
I didn't know that actually, I really had no idea that there were
Office engineers contributing code. I just sort of naively I guess,
assumed it was a one way street. You guys were sending them a build
update every now and then and they were ingesting it.
Sid Jayadevan (00:19:51):
And that is fundamentally how it operates because we do want
everyone within the larger Microsoft ecosystem to be benefiting
from the same enhancements, so there is a build that goes out every
month for Excel desktop. But we also have a lot of teams across
Microsoft who are contributing in a fairly big way.
Rob Collie (00:20:11):
In terms of, we talked about the M language and all of that. I just
told you the story about never getting the chance to do Data Merge
on the Office team. I'm really deeply curious about how the M
language got greenlit, how did the need for it get recognized and
bubbled up into something that they got resources. Because like I
just said, it's such a crazy thing. If you haven't experienced the
pain of the world, in terms of automatically munging and
transforming data, if you haven't experienced it, and most people
haven't, even at Microsoft, most people haven't experienced that,
trying to convey that pain to other people is very, very difficult.
I look at Power Query as, look, this is something that the world
needed, not just like a demographic, this is something that improved
the world. And yet, I know from experience, it's very, very
difficult to explain to people who aren't already on board, what the
value is. Is there anything that we can talk about there?
Sid Jayadevan (00:21:06):
Without getting into all of the details, we went through a number
of iterations to get to where we are and where it started was with
some precursors to M, which were more about modeling, what we set
out to do. And there's a large number of people who contributed to
this. And so some of this predates some of our contributions. I've
been involved with the project on and off, in fact, I left at some
point and came back to it. And a lot of the seeds of the project
were in modeling related efforts. So ways of modeling your data,
modeling your relational data model. And as I guess, in hindsight,
could have been expected, folks started to realize that a lot of
what you needed to do to have a successful modeling environment was
enable transformations as a first class thing. And so at some
point, you had a language that was a little bit of data modeling,
and a little bit of transformations layered on top of that.
Sid Jayadevan (00:22:12):
And frankly, over time, we talked about how the query thing became
more and more important and became the essence of the product. The
data modeling side of things faded to some extent, and the focus
shifted towards transformation. And it shifted towards
transformation of all data. There was a period, not just at
Microsoft, but in the industry, where we were very focused on data as not
necessarily a silo but homogeneous data stores. And when that
heterogeneity of data became a reality that no one was going to
change, the focus of tools like ours, and languages like M, shifted
more towards that ability to embrace all kinds of data, wherever it
might live. That, of course, changed the language and gave us what we
have today.
Rob Collie (00:23:04):
This will show how old I am, there used to be a series of
commercials for Reese's Peanut Butter Cups, where two people would
be walking along, one of them would be carrying a chocolate bar and
one would be carrying an open jar of peanut butter for some
inexplicable reason. And they'd bump into each other and the two
would accidentally mix. And then they'd accuse each other, "You got
your chocolate in my peanut butter." "No, no, you got your peanut
butter on my chocolate." And then they would take a bite of it, go,
"Oh, my God, this is the best thing." It kind of has that feel to
it, doesn't it? The origins of M and Power Query, it's not like
there was this anticipated union with DAX and what we think of
as the VertiPaq, Power Pivot data model. That wasn't a mission
statement from the beginning, it's just these two things ended up
going together super, super, super well, sort of an accidental
union. That's been my sense of it forever. Is that true?
Sid Jayadevan (00:23:53):
I think that's a very fair assessment. Miguel, what do you
think?
Miguel Llopis (00:23:59):
Yeah, I tend to agree. I was actually thinking about the previous
comment you made about the heterogeneous nature of the data space
right now. So yeah, really, when you talk about big data, it's not
really only about the volume of data, there's also the variety of
data, both in terms of the sources you connect to, the schemas they
have, the different keys on either side, and the need to use things
like fuzzy matching and mapping tables and whatnot.
Miguel Llopis (00:24:22):
And then lastly, it is also about the velocity of the data, there's
some data that changes once a day, there's some data that changes
once a quarter, there's some data that changes multiple times per
second. And so providing tools for non technical users, which is
the vast majority of people in the world to actually be able to do
this efficiently and with ease. And even for somebody who can
do the hard thing, of course, who wants to do the hard thing if you
can do it in much simpler ways? I think that that was key to us and
just democratizing this whole problem space and of course, there's
a lot more that we can do.
Miguel Llopis (00:24:55):
And thanks, Rob for your list of suggestions from the team. We love
those and within our team, we do have this whole bucket of what we
call customer love, which is about, "Hey, give us a problem that
you've ever tried to solve with Power Query and it didn't actually
make the cut for you. And I will try and generalize that and give
you a feature out of it." That's how many of our existing
transforms came about.
Rob Collie (00:25:17):
It's just such a rich canvas. When you start from a language, you
have a lot of future flexibility in what you can do, it's awesome.
The heterogeneity thing again, I also really reacted to that, that
speaks to me. Another thing that shows my age, I grew up during the
peak Cold War between the US and Russia, or NATO and the Warsaw
Pact, whatever. And so I read a lot of Tom Clancy and I was one of
those kids.
Rob Collie (00:25:44):
Something that really strikes me from that is that the two
different philosophies of the 1970s, 1980s, Russian military
strategy versus the United States is, you see it every morning
or every day of operations at a United States Air Force Base.
Everybody at the Air Force Base, gets out in a big long line, and
walks the entire length of the runway, picking up pebbles, and all
kinds of foreign objects from the runway, because if any of that
gets sucked into the intakes of these really sensitive airplanes
there's going to be hell to pay, it's going to break it, it's going
to go down. Whereas, the Russians built everything that they had,
at least in theory to eat mud.
Rob Collie (00:26:23):
I think the old world of BI was that 1980s American strategy. You
had to have this absolute clean room. It's ideal, frictionless
circumstances in order for everything to work right, which is, of
course, it's completely unrealistic. The real world is dirty, it is
noisy. There's chickens running across the runway, it's not just
pebbles. Is there even a runway? And this wave of Microsoft tools,
the Power BI beating heart, which Power Query is part of it, I
mean, it is built for that real world messy, dirty reality. It's
not the kind of thing that you imagine when you're sitting around
at a whiteboard doing computer science. And when computer science
can meet that kind of reality and perform, it's really something to
behold. It's just a whole new era, isn't it?
Sid Jayadevan (00:27:16):
Yeah, I couldn't agree more. It's messy and dealing with that
messiness is still very much a work in progress. But that's the
thing we're trying to embrace, that messiness that isn't going away
anytime soon.
Rob Collie (00:27:29):
It only gets messier, even at our company.
Miguel Llopis (00:27:31):
But at the same time tools get better and smarter. So how can we
actually make it so that it's even easier and easier for you to do
these things with Power Query in this case?
Rob Collie (00:27:41):
Yeah. Some of the things that you can start to do with machine
learning and AI to write the code for them, there's some scary
stuff that can be done there. You know, column by example is sort of the
most straightforward poster child for that kind of thing. I do want
to at least make one joke with you, which is actually the truth but
it's funny, is that back when I would teach classes, we still teach
a lot of classes, but they don't let me teach them anymore, because
I'm not as good as the people on our team. But whenever I bring up
M, and I would show people the code, and so we'd be using Power Query
a little bit, and then I'd show them the code that it was
generating. And then I would zoom in on that code. And I'd say,
"And this word here at the beginning, tells you everything you need
to know about where this thing came from." The first word of every
Power Query script being the word let, I just talked about like the
messy real world reality.
Rob Collie (00:28:31):
But the word let at the beginning of every Power Query script,
tells you this came from the ivory tower. I look at the class and
say, "It's almost like a philosopher smoking a pipe, who then says
to you, 'Suppose. What if we posit,'" and then the script
starts. Like I said, I admire what M can do. But the M language
itself doesn't speak to me in its raw form. I look at it and I kind
of want nothing to do personally with editing it. A lot of people,
especially on our team, they do, I'm just one of those people
that's like, for whatever reason, I was willing and able to learn
DAX and I typically don't learn stuff, I don't learn new tool sets,
I don't learn new languages. The fact that I learned DAX is really
an outlier for me. I'll never learn M, not in its raw form. I'm a
button pusher, dyed in the wool.
Sid Jayadevan (00:29:25):
And then we want to cater to all the constituencies, the folks on
your team who want to edit the M, we want to make that possible. And
for the many folks who would rather press the buttons, for that we
have the visual interface.
Rob Collie (00:29:41):
Do you have those personas behind the scenes where you talk about
the person who only wants to push buttons, you have the
unsophisticated user of M persona? Can we just name it Rob and I'll
give you a picture of me going...
Miguel Llopis (00:29:53):
Actually, I call dibs on that one because I'm that kind of person
as well. And that's what I would push for most of the time.
Rob Collie (00:30:00):
Damn it. And you being on the team, you got an inside track to be
the persona. All right. Well, listen, I'm waiting in the wings.
I'll be your understudy. So how much have the two of you gotten
exposure to the next part of the chain, which is, do you sit around
building Power BI models? Do you write DAX? Do you build data
models?
Miguel Llopis (00:30:20):
Yeah, big time. I mean, we use our tools, the tools that we build,
we use them internally for, for example, understanding how users are
using our products or understanding our backgrounds and our feature
tracking report, you name it. Not to mention personal projects:
I do have my personal projects with Power Query and Power BI as
well for non-work-related stuff. And that, in my experience, is
what has helped me the most to actually understand and internalize
all of the end user pain points around this area and push the tool
to become better. And I know Sid does quite a bit of this as well.
Sid Jayadevan (00:30:57):
Yeah. The entire team does a large amount of eating our own dog
food. Dogfooding, you've heard the Microsoft term for this. That's
always been a very large part of what we've done. It's not just
about using Power Query, it's about using it in the context of all of
the things that Power Query is hosted within. And so Power BI, of
course, and Excel and Power Apps, and aspects of Azure, we try to
ensure that we're experiencing the end to end experience as much as
possible.
Rob Collie (00:31:29):
It's just a complete divergence from the path we've been on. But I
want to at least mention to you before I forget that, in the past
seven days, last two work weeks, off and on I've been teaching a
little bit of Power Query to a high school football coach. We're
just kind of messing around for the moment with Power BI through
a pro bono project. It's just sort of a passion project of mine. I
got to tell you, it's fun. This guy's eating it up. He's loving it.
I'm showing him how to add error checking and things like that for
when there's the temp Excel file still in the folder that he's
trying to load from, that's going to mess things up. Well, you
could filter that out and everything. And yeah, he's sponging it up.
It's just cool to see it. It's all these unexpected places, you see
these tools end up being used.
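The temp-file filter Rob describes can be sketched outside of Power Query. The Python below is only an illustration of the idea, not the M that Power Query would generate. It assumes the stray files are the lock files Excel creates while a workbook is open, whose names start with `~$`; the function name `loadable_workbooks` and the sample file names are made up for this example.

```python
from pathlib import PurePath

def loadable_workbooks(file_names):
    """Keep .xlsx files, skipping Excel's temporary lock files (prefix "~$")."""
    keep = []
    for name in file_names:
        path = PurePath(name)
        if path.suffix.lower() != ".xlsx":
            continue  # not a workbook at all
        if path.name.startswith("~$"):
            continue  # temp file that exists while the workbook is open
        keep.append(name)
    return keep

# Hypothetical folder listing for illustration.
folder = ["week1.xlsx", "~$week1.xlsx", "week2.xlsx", "notes.txt"]
print(loadable_workbooks(folder))
```

Filtering before loading means the temp file never reaches the step that would otherwise fail on it.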
Rob Collie (00:32:14):
So both of you seem to have a lot of opportunity to sort of drive
the race car that you build. And that was not something that I
really felt like I had much chance to do when I was at Microsoft.
It's like, we built race cars, we have no idea what it feels like
to sit behind the wheel. And so it's always surprising to people
that with whatever tool I've been working on, the customers were
better at using it than I was. It's nice that there's a little bit
more of a culture now of using the tools even for personal use.
Personal use is fantastic, there's nothing better than personal
use.
Sid Jayadevan (00:32:46):
Absolutely. As Miguel mentioned earlier, it's for both hobbyist
projects, pet personal projects, as well as internal day to day
work. Love using it for all of those things. And Miguel in
particular has I think some soccer things that he probably uses it
for, but I'll let him speak to that.
Miguel Llopis (00:33:10):
Yeah, definitely soccer as well as a bunch of other things I
wouldn't name. Yes, quite a few personal projects.
Rob Collie (00:33:17):
It's really nice of you to call it soccer for us. I'm sure you
don't call it soccer with your fellow soccer fans.
Miguel Llopis (00:33:24):
Yeah. You mean with our football fans?
Rob Collie (00:33:24):
Yeah. What are some of the craziest things you've seen? I'm sure
that you've got just some really crazy stories of things that
you've seen customers doing with Power Query that you never would
have expected? Anything like that come to mind?
Miguel Llopis (00:33:37):
Many things. So I guess you could take crazy in a couple of dimensions.
One could be unrealistic expectations on the tool or the
technology. The other one could be tremendously complex projects.
So I'll actually head down the second path.
Rob Collie (00:33:53):
Sure. Let's do that.
Miguel Llopis (00:33:54):
I think the biggest Excel workbook with PQ queries I've ever seen,
had probably about 280, 290 queries on it. I'm glad we introduced
query groups as a feature because that person will be there in the
world without them. But even then, it's a pretty heavy project to
maintain.
Rob Collie (00:34:13):
And the dependency.
Miguel Llopis (00:34:14):
Yeah, I was going to say understand query dependencies. So you do
have some support for that in Excel today with query dependencies.
We're working on way more interactive, highly visual experiences
that eventually will make their way into Excel. But as of now
available in the Power Query online experiences with what we call
the Diagram View, which is currently in public preview.
Sid Jayadevan (00:34:35):
290 queries.
Miguel Llopis (00:34:36):
Yep. And they're all legit. We literally sat together and said,
"Let's simplify this." And actually, yeah, he could have combined a
few things, but it actually made sense the way he had it
organized.
Rob Collie (00:34:47):
And is the endpoint of that data landing in Excel?
Miguel Llopis (00:34:52):
Yes, it was inside an Excel workbook.
Rob Collie (00:34:53):
Wow. Wow. You don't have any examples of people using Power Query
or data flows to automate their home? For example, I have a friend
of mine right now, who is setting up using Power Automate, he's
setting up where if he gets a text notification from a certain
Internet of Things system, it will go in and adjust the temperature
gauge, the thermostat, turn on heaters, turn on humidifiers, things
like that. It's a terrarium, he needs to maintain the balance in
this biosphere that he's built. And he's got monitors in there, but
all they'll send him are text messages. That's all he can get. But
he's like, "No problem, I'll eat those text messages and feed them
into the power platform. And next thing, we're adjusting
temperatures and humidity and all that kind of stuff." I bet
there's a lot of stuff out there like that, it's data
transformation but analysis isn't the endpoint. It's being used for
something else.
Sid Jayadevan (00:35:54):
We're blown away by a lot of the creativity. We've seen a lot of these
very self regenerative programs that people have created, where the
queries adapt and do all kinds of things. It's a ton of
creativity.
Rob Collie (00:36:11):
One scenario, and now we're doing the program manager feature
design thing. And one scenario that I've wondered about for a while
is failures in a Power Query, the error handling. Using the moment
of error, harnessing that, and activating a human workflow to
address it. The way you're nodding, this is not the first time this
idea has come up, right?
Miguel Llopis (00:36:38):
Yeah, I was wondering if I had mentioned some of that stuff to you.
Because today, within the Power Query Editor experiences, you do
get some help with data profiling features, you understand
duplicate values, you understand errors, to some degree, at least
within the data in the preview, or it lets you run that over
the entire data set. But nothing really helps you, after you
save that and you say, "Yeah, refresh this thing every day at 8:00
AM," with understanding if that is still correct, if you get a new
outlier value, if you get a new duplicate value, and you get some
errors around that. That's one of the areas that we're looking at.
And it goes back to the thing we were talking about earlier about,
how can we further simplify this tool and make it more productive
for the real users of it on a day to day basis. And this is clearly
one of those areas. I mean, if you're putting together a report or
a dashboard for your boss, you want to make sure that they don't
start looking at the wrong data without you even knowing.
Rob Collie (00:37:32):
Oftentimes, it manifests in some very sinister ways. Like if a data
source succeeds in refresh, but it feeds you back nothing but
zeros. [crosstalk 00:37:42] There's no runtime error. And then, of
course, if you saw a report with nothing but zeros on it, you'd
notice, you say, "Oh, clearly, this thing's dead." But if those
zeros are only one leg of a five leg platform that makes a single
metric, the answers you get on your report can still be
credible.
Miguel Llopis (00:38:00):
Yes, that is a problem.
Rob Collie (00:38:03):
And I'm speaking from experience, I've been burned by exactly this
sort of thing in the past. Even when there's a runtime error, it's
almost always a human being that has to go do something. If a
duplicate key comes in, that wasn't there before, what do I do
about that? I have to-
Miguel Llopis (00:38:19):
Would it be nice if we just fix it for you? Or if we maybe ask you,
"We saw an issue and this is what we think you might want to do."
And we give you a couple of options. And maybe you don't even have
to go to the tool, maybe there's a quick text message you get,
maybe somebody is giving you a phone call while you're driving,
maybe it's an email that comes in and just with a couple of clicks,
you can just get it fixed.
Rob Collie (00:38:41):
These are all good ideas. I like this. This sounds promising.
Sid Jayadevan (00:38:44):
One thing that we recently added in this space was integration with
Power Automate. So that's more on the data flow side. And it's
early days for that, but we've already seen some very interesting
solutions. One of the things you can now do is have your data flow
include a bunch of these reports for issues that you mentioned, you
could perhaps partition off the errors or have a bunch of litmus
test queries that check the data quality. And if those queries
start yielding results, you can fire a Power Automate flow that can
engage whatever workflow makes the most sense for you. Whether it's
sending an email, whether it's writing something out somewhere for
someone to take action, going all the way to sending someone a text
message. All of those things are possible. They're perhaps not as
frictionless and out of the box as they could be, but we're making
some of those things more possible.
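The "litmus test query" pattern Sid describes can be sketched in a few lines. This is a Python illustration under stated assumptions, not the data flow or Power Automate API: each check returns the offending rows, and a notification fires only when a check yields results. The check functions, the `notify` callback, and the sample rows are all hypothetical.

```python
def find_errors(rows):
    """Litmus test: rows whose quantity is not a valid integer."""
    bad = []
    for row in rows:
        try:
            int(row["quantity"])
        except (TypeError, ValueError):
            bad.append(row)
    return bad

def find_duplicate_keys(rows):
    """Litmus test: rows whose id was already seen earlier."""
    seen, dupes = set(), []
    for row in rows:
        if row["id"] in seen:
            dupes.append(row)
        seen.add(row["id"])
    return dupes

def run_checks(rows, checks, notify):
    """Run each check; call notify(name, offending_rows) when it yields rows."""
    fired = []
    for name, check in checks.items():
        offending = check(rows)
        if offending:
            notify(name, offending)
            fired.append(name)
    return fired

# Hypothetical data for illustration.
rows = [
    {"id": 1, "quantity": "3"},
    {"id": 2, "quantity": "!"},   # a stray "!" where a digit was expected
    {"id": 2, "quantity": "5"},   # duplicate key
]
checks = {"errors": find_errors, "duplicate_keys": find_duplicate_keys}
alerts = []
fired = run_checks(rows, checks, lambda name, bad: alerts.append((name, len(bad))))
print(fired, alerts)
```

In a real setup the `notify` callback would be whatever workflow makes sense: an email, a written record, or a text message.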
Rob Collie (00:39:42):
I think that problem of merging the automation with human like
referees of the occasional error is probably as ambitious of a
problem to address as Power Query was originally. I've got a lot of
respect for that problem, placing myself in your shoes. Might not
be quite that ambitious, but it's a large problem. It's a
product level problem to solve as opposed to a feature. Every now
and then like, I get some data where someone keyed in an
exclamation point instead of a one, because their shift key was
down, and all hell breaks loose over that exclamation point.
Rob Collie (00:40:24):
You got a hard job, the error tracking in your system, it's many
levels deep. We all know the experience of you get the error, and
the top 11 errors all say exactly the same thing. And you scroll
through the list to get to the one at the bottom that tells you
hopefully, what really happened before the downstream errors
happened. It's hard to bubble up the right error to the right
person at the right time when almost by definition, you don't know,
you can't anticipate what this error is going to be, you have no
idea what's going to come in. So I recognize this as sort of a
frontier for you, but I do not mean to trivialize it at all. It's
only an improvement. It's not like you need to do this otherwise,
everything you've done is... No, you can stop today completely and
Power Query is arguably complete, you just have so many places
where you could-
Miguel Llopis (00:41:16):
Go and tell that to Satya, we want to still keep our jobs. Got to
find new challenges.
Rob Collie (00:41:20):
Well, next time I talk to Satya, next time he calls me up for
advice. Yeah, I think it would be a shame if you did stop. It's a
compliment to what you've got, that if you stopped today, it's
already well past amazing. I'd say to students and clients that
there are two engines at Microsoft, two data engines in particular,
that all of Microsoft's competitors wish they had instead. Whatever
you're going to call the DAX and data model engine, VertiPaq.
Microsoft is not very good at naming, I don't know if you all know
that. And then the other one is the M engine, the Power Query
engine, which also by the way, goes nameless in all of your
products. It's just Get Data or Import or whatever now, Get &
Transform.
Miguel Llopis (00:42:03):
It's the M engine and Power Query is the experience.
Rob Collie (00:42:06):
These two engines, whatever you call them, they belong in the
software Hall of Fame. I believe that. And this is a very vicious
critic of software, who's talking to you right now. I hate
software. And these two things, they demand your respect, it's got
to feel good to have been involved in something like that from such
an early stage. It's got to be one of the most gratifying sorts of
experiences for a software engineer because most of the time, it's
not like that.
Miguel Llopis (00:42:33):
This is such a tough interview, Rob.
Rob Collie (00:42:36):
To make you guys feel all gushy about yourselves.
Miguel Llopis (00:42:43):
Yeah. Don't know what to say [crosstalk 00:42:44].
Sid Jayadevan (00:42:43):
That's very kind of you.
Rob Collie (00:42:43):
Oh, come on, you've lived it. Right? You've probably also lived as
software engineers, you probably lived the other kind of project
too. There's all kinds of dead ends in software that you can chase
for years.
Sid Jayadevan (00:42:54):
I think one thing that's been a big differentiator with this one
is, so Miguel and I are here today, but there's a team that has
stuck together over an extended period of time. And it's the most
fun I've had in my time at Microsoft. I'm very, very fortunate to
work with those folks. For a problem like this, there is a kind of
continuity that becomes necessary to... You talk about the
iteration and needing to keep going. And we have a lot of work
ahead of us.
Sid Jayadevan (00:43:24):
But the thing that has made this easy and fun, at least from my
point of view is the team has been phenomenal. You tend to have a
lot of churn on teams, and you go through phases, and people come
and go. But this has been one where there's a set of folks. And I'm
not talking about a handful of folks, it's probably a few handfuls
of folks who really pushed on this over many, many years. I think
that's one thing that's been a little different vis-a-vis a lot of
other projects, that there's been a set of folks who have stuck
with it and have been incredibly passionate about it. And that's
been a big part of Power Query.
Miguel Llopis (00:44:02):
Completely agree.
Rob Collie (00:44:03):
Some products really require that kind of continuity in order to
continue being successful. Excel, by the way is one of them. I
think Excel, I don't really know what it's like today, but when I
was there, there was pretty healthy turnover every release on the
Excel team. And the developers, the engineers, the ones actually
writing the code, they had a bit more continuity, actually quite a
bit more than the program managers. It was every two years, the
school bus would drive up, all the program managers would get on, it
would leave, a new school bus arrives with younger program managers
and would drop them off. I got off that bus one day and joined the
Excel team.
And the engineers on the team were just like, "Ah, the new
youngsters, we got to train these people now too."
Rob Collie (00:44:55):
It was a year and a half of working on Excel before I stopped
coming up with feature ideas, like wouldn't it be cool if Excel
could do this. It was a year and a half before I stopped coming up
with ideas like that, where they'd look at me and say, "Yeah, we
already have that." Honestly, I think that culture, that continuity
was enforced more by a handful on the Excel team when I was there,
they were keepers of the flame, if you will. And there was like one
on the program management team, half a dozen on the dev team.
Sid Jayadevan (00:45:27):
And you have a lot of those projects where you'll have one or two
keepers of the flame. And I think what's been unusual with Power
Query, at least compared to other projects I've been on is that
there have been many, many keepers of the flame. And of course, you
want fresh ideas. So you want people to be coming in and bringing
those ideas, and we've had a lot of that as well. And so there's
been keepers of the flame, there have been challengers of the flame
in a very good way. So we've had that mix. But there has been a lot
of good cohesion.
Rob Collie (00:45:59):
It sounds like a good title for a Kickstarter funded board game,
challengers of the flame. I'll tell you what, well, you all get
equal rights. We'll call it a common intellectual property, that
name. I'm hereby seizing 1/3 ownership in Challengers of the
Flame, LLC.
Sid Jayadevan (00:46:19):
What was the board game at the end of Office Space, the Jump to
Conclusions board game?
Rob Collie (00:46:28):
I don't actually remember, I've seen that movie so many times. Now,
I've got an excuse to go watch it again. Tell my wife, "Listen,
this is important. This is for work."
Sid Jayadevan (00:46:38):
That was our quandary, what do you name the thing?
Rob Collie (00:46:43):
So how much commonality is there, I'm assuming a lot, between data
flows and the version of the M engine that lives in Power BI?
Miguel Llopis (00:46:55):
Basically, it's the same engine. So data flows, the way I like to
talk about this is layers of the onion. So if you think about the M
engine as the core of the onion, then the next wrapper around that
is the Power Query experience that allows you to create queries
that run in M. The outer layer on top of that is really the data
flows, which really automate and orchestrate many different sets of
Power Query projects that were defined with a Power Query experience
to generate M that runs.
Miguel Llopis (00:47:23):
So you could have a data flow that maybe brings in, say, your
customers data, your customers table or your customers entity, and
you may have another data flow that connects to that customers
entity and then maybe does a bunch of additional Power Query and M
query transformations to compute your customers who are most likely
to churn. And it's the orchestration of that: whenever that
customers table gets refreshed, cascade refresh everything else that
depends on it. That is what data flows are.
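The cascade refresh Miguel describes amounts to walking a dependency graph. The sketch below is a Python illustration of that idea only, not how data flows are implemented: it finds everything downstream of a refreshed entity and orders the refreshes so each entity runs after its sources. The entity names and the `cascade_refresh_order` function are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def cascade_refresh_order(depends_on, changed):
    """Return the entities to refresh, upstream first, when `changed` refreshes.

    depends_on maps each entity to the set of entities it reads from.
    """
    # Collect everything that depends on `changed`, directly or indirectly.
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for entity, sources in depends_on.items():
            if entity not in affected and sources & affected:
                affected.add(entity)
                grew = True
    # Order the affected entities so each one refreshes after its sources.
    graph = {e: depends_on.get(e, set()) & affected for e in affected}
    return list(TopologicalSorter(graph).static_order())

# Hypothetical entities for illustration.
depends_on = {
    "Customers": set(),
    "ChurnRisk": {"Customers"},
    "ChurnReport": {"ChurnRisk"},
    "Products": set(),
}
print(cascade_refresh_order(depends_on, "Customers"))
```

Refreshing `Customers` pulls in `ChurnRisk` and `ChurnReport` in that order, while the unrelated `Products` entity is left alone.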
Rob Collie (00:47:55):
That makes sense to me. One of the challenges that I know that
Power Query faces is that at tremendous scale, when the data is
just gigantic volumes, the elapsed time of a query can get up
there. And it's just an optimization thing. It's almost like the
ideal software problem to have as engineers. How much progress has
been made over the years? I haven't really been paying much
attention to it. I just remember from the very early days, people
saying, "Okay, it's great, but we can't use it for the 500 million
row data set, going through a Power Query, just takes too long."
Have there been any strides made? Again, I'm really sympathetic to
this, it's a really hard problem. Power Query has to process every
single row, it can't do the things like the VertiPaq Engine does
where it sort of groups rows into clusters and treats them as one
band of rows, you don't get those really nice columnar in memory
tricks when you're performing transformations. So you're kind of up
against physics in a way.
Miguel Llopis (00:48:56):
Yeah, great point. So there's actually two avenues we can take to
answer that question. I'm going to talk about both, I'll just call
them out. And then I'll answer the easy one and I'll let Sid answer
the hardest one. One is about increasing the scale of what you can
process with Power Query. And of course, you need to do that. But
on the other extreme, there's also the make it clear to the end
user clicking those buttons, as Rob usually does, that there is a
problem, and so that they can correct that problem before it
actually becomes the root cause for things that are many, many
steps further down the pipe.
Miguel Llopis (00:49:29):
And so on this area on making things more clear to users, we're
actually introducing quite a few new features. We just announced
something called the step folding indicators. So it's a feature we
recently launched inside Power Query online, inside data flows that
as you connect to a data source, let's say, SQL Server, and you
connect to the customers table, and then you apply a filter to
maybe say exclude customers in the US, then you get your filter
step, and it will actually give you a tick next to it to say, "This
has actually been pushed down to SQL because SQL can run filters
like this one." Now you go to a different operation that does not
fold. As a new step, it will actually immediately tell you, "Hey,
this thing is no longer going to run in SQL, we're running it
locally, we're compensating here, this is what's going to happen.
In the extreme, this might actually cause you issues, click here to
learn more, learn some best practices for how you could do things
differently." And many other features that we're working on there.
This is the most basic way to tell the most basic end user about,
"Hey, there might be a problem." It's like the engine light on your
car dashboard.
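The per-step indicator described above can be modeled with a toy example. This Python sketch is not the Power Query engine's logic; it assumes a simplified rule in which each step folds only if the source supports its operation, and every step after the first non-folding one also runs locally. The operation names and the `folding_indicators` function are invented for illustration.

```python
def folding_indicators(steps, source_supports):
    """Label each step "folds" or "local".

    steps: list of (step_name, operation) in the order they are applied.
    source_supports: set of operations the data source can run itself.
    Simplifying assumption: once one step cannot fold, every later step
    also runs locally, because it works on locally computed results.
    """
    labels = []
    still_folding = True
    for step_name, operation in steps:
        if still_folding and operation not in source_supports:
            still_folding = False
        labels.append((step_name, "folds" if still_folding else "local"))
    return labels

# Hypothetical source capabilities and query steps.
sql_like_source = {"select_table", "filter_rows", "group_by"}
steps = [
    ("Customers", "select_table"),
    ("Excluded US", "filter_rows"),
    ("Added index", "add_index_column"),   # assumed not foldable here
    ("Filtered again", "filter_rows"),
]
for name, label in folding_indicators(steps, sql_like_source):
    print(f"{name}: {label}")
```

The last filter is marked local even though the source could run a filter, which is the kind of surprise the indicator is meant to surface early.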
Miguel Llopis (00:50:33):
We're also looking at things like query plans, more detail, deeper
information for slightly more advanced users who actually
understand the underlying SQL, the underlying code behind it, to go
reason about okay, where are actually things going south? How do I
understand this better? So I answered the easy part of the
question, which is how do we make it clear that there's a problem?
Now Sid can talk about the exciting stuff we're doing on the
scale.
Sid Jayadevan (00:50:56):
And I'll have, I guess, an unsatisfying, cryptic answer to that,
because it's probably our largest area of investment right now. But
we don't have anything that we can really announce yet, but it's
something we're working on. And that should come as no surprise,
because we hear a lot of feedback in the space. And there's a lot
going on at Microsoft and in the industry in this space around
making compute more available, even if your data lives somewhere
where there isn't compute. And so that's something that we will
definitely be investing in and that we're actively working on.
Rob Collie (00:51:32):
I actually find that to be a very satisfying answer. Because
honestly, all I want to know is that A, people are working on it,
and B, there's optimism, there's still improvements to be made.
That's all I really need to know. I mean, there's a nerd part of
me, it's like, "Okay, come on. How do we do it?" But even then
probably, if I got too close to it, I'd probably go, "Oh, yeah, now
I'm bored." I'm really just interested in the fact that it's
going to happen. Have either of you seen all of these, a meme, but
it's a YouTube meme, a format that's Hitler losing his cool,
screaming at his generals in the bunker, and the subtitles have
been replaced with something completely different? You seen
these?
Sid Jayadevan (00:52:15):
Yep, seen those.
Miguel Llopis (00:52:16):
I've not, I'm too young for that.
Rob Collie (00:52:18):
Oh, really? YouTube didn't go and record Hitler in his bunker. I
don't know if you know that YouTube is a relatively recent invention
that probably has happened in your lifetime.
Miguel Llopis (00:52:30):
But I just don't have time to watch it. There's just so many soccer
games to go watch. Sorry, football games.
Rob Collie (00:52:36):
Football games. I agree. So I made one of those a long time ago,
making fun of Tableau. And in terms of the first three months of
its existence it's probably the video that's been watched the most
of all the things I've ever done on YouTube. There's a part at the
end where he mutters under his breath, he turns to look at his
subordinates and says something like, "And if you think we're paying
for those Alteryx licenses, you better be sprucing up your
LinkedIn." So as the Power Query folks, I made that joke for
you.
Miguel Llopis (00:53:09):
[crosstalk 00:53:09] Geek.
Rob Collie (00:53:10):
There's a lot of inside baseball in that video. Even the Tableau
employees that have seen it, look at me say, "Okay, that actually
was pretty funny." What's next? What have I not asked about? Such an
exciting space with so many opportunities.
Sid Jayadevan (00:53:26):
We have a whole new interface coming in terms of a more
diagrammatic visual representation of the queries. Miguel may have
alluded to this before. That's a big one, changes the profile of
the product quite a bit. We're not taking anything away. And we
don't want to make things tricky for people who are familiar with
the existing interface so it's strictly additive. But that's one
we're really excited about. We've tried a few things there. It is a
new interface, but we're also using it as a way to address some of
the feedback that people had, just around how you track
relationships and make it a little more fluid to chain things
together. So that's one that I think the team's very excited about.
So we're going to push that one out pretty soon. And that's already
in preview, so you can go play with it.
Rob Collie (00:54:19):
I think I should as one of the absolute sloppiest designers of
Power Query scripts in the world. If you ever want examples of
really, really, really ugly, I can't believe this Rube Goldberg
sequence that someone's written, all you need to do is just ask me
for anything that I've done. I've got stuff now that I'm just like,
"Okay." I've got four queries that are basically one to one
linearly feeding into each other, that their only purpose is to
feed the next one. And they're not even sorted in the proper order
in the query pane. Even I don't remember which one is the root, I
don't remember which one is the first one in the assembly line.
Every time I go back and look at it, I have to sort of re-trace, trace,
trace, trace. Like, "Okay, that's right. That's how this thing
works." You want ugly, I got you covered.
Miguel Llopis (00:55:06):
Yeah, we would love to see those and see how what we're investing in
actually behaves against that. So yeah, I just sent you a link on
the chat window for the Diagram View, would love your feedback on
that. Let us know your feedback about every other area. And again,
we will take it and we'll generalize it, and we'll make it into
something that improves the product.
Rob Collie (00:55:26):
And this poor high school football coach, his first exposure to
Power Query is with exactly the example I just told you about. He
has no idea how much better it can be.
Sid Jayadevan (00:55:37):
That's cool. You know that ad hoc style of using the product where
you don't necessarily architect how your queries come together, in
some ways we want to cater to that even more, we don't want to
go in the direction of some very formal modeling exercise. And we
want to keep enabling that style of using the product. And so this
tool is not meant to police any of that, it's more just to help you
understand it better. I have some mashups where I have so many
queries, and they could have been designed way, way better. And
over the passage of time, a few months later, I look at the thing,
and I have no clue what I did, back when I created it. So this is
meant to help with those sorts of things. It's more to help you
decipher what others did, and sometimes help you decipher what you
did.
Rob Collie (00:56:29):
A previous version of you is almost just as inscrutable as another
person's work. I can be away from it for two days and come back and
go, "What was I doing here?" Same is true, by the way, with
spreadsheets, traditional spreadsheets, non DAX spreadsheets. I can
generally go back to one of my DAX models, and pretty quickly get
back into the personality of what I was doing there. But the old
spreadsheets, using just the Excel formula language, and really
pushing it to its limit, oh my gosh, those things. I'm always
impressed at how smart I must have been in the past to have put one
of those together. The current version of me always feels dumber
than whatever the version was, that was able to do what I did in
Excel back in the day. So, connectors?
Miguel Llopis (00:57:14):
There's a roadmap on connectors, there's some... Overall, our
strategy with connectors is one where we have the custom connectors
as the key that empowers anyone to build connectors. This could be
you building your own connector for whatever you're trying to do.
Or this could be an actual ISV company who owns an underlying data
source backend, who actually wants to provide connectivity to that
from Power BI, from Excel. And we do have certification programs
around that.
Miguel Llopis (00:57:42):
So really, there's some new connectors coming out of our team,
there isn't much in terms of net new connectors. There's a bunch of
connectors that our team owns from the early days when we didn't
have this way to actually extend our SDK. And there is where you
see most of our investments on making sure that X connector now can
use this certain new feature that the underlying back end added or
that customers are demanding now, more than others.
Miguel Llopis (00:58:09):
So I wouldn't say there's much in terms of excitement there on
connectors to cover at this level; it's very point-wise, feature-level
things on existing stuff, does have experiences, Power Query
experiences. So yeah, everything you want to talk about regarding
diagram views, or more by example, like experiences infuse AI into
the product. On the data flows front, we talked a little bit about
the refresh-based data quality and monitoring stuff, which is
actually not formally in our public roadmap. But just because I
think the discussion we have, it just screams at, hey, this is
actually a useful area that I think is just okay. Those are kind of
the big pillars.
Rob Collie (00:58:47):
There was something interesting that as you were talking about the
connectors. Of course, Microsoft cannot write connectors for
everything, the list of everything is damn near infinite. And a
reasonable percentage of the time, the system that I wish there
was a connector for is a non-Microsoft product that, at best is
sort of neutral towards integration with Microsoft technology and
other times it's openly hostile to it. And so, at our company, of
course, we're Microsoft at its core, we use a lot more Microsoft
than other stuff.
Miguel Llopis (00:59:21):
Come on, you don't need to apologize, what else are you using?
Rob Collie (00:59:24):
We use a lot of things. And so like Salesforce is our CRM, and in
terms of workflow, it's one of our most central systems. Certainly
not our only system. Right off the bat, we've got an alien right in
the middle of the story. And we park a lot of data for our own
internal BI. And by the way, our internal BI is very, very, very
sophisticated today, it's not a stretch to say that we simply could
not survive without it. It's not like our business operates and
then we use BI to optimize it, it is life support. It's the oxygen
supply, it is really, really, really, really critical to us and our
business model.
Rob Collie (01:00:07):
So we use another third party product called Stitch, which you've
probably heard of. They've written a bunch of connectors that
essentially will then dump data into various endpoints that they
know about. And so we get a lot of data out of our core systems
into the Azure Data Warehouse, so not just Azure SQL, via Stitch,
and then Power Query kicks in. It's not like, it just lands there.
Gosh, our Google AdWords data, we're grabbing that from Stitch into
Azure Data Warehouse.
Rob Collie (01:00:40):
And then because the data that's grabbed... It's so weird, guys.
AdWords data is like day to date running totals. So every time you
take a snapshot of it, it's like you had three clicks last hour,
now you have seven clicks. Does that mean you have 10 clicks today?
No, you have seven. So we've got Power Query that is doing a group
by and taking the max, or grouping by the most recent timestamp on
that day, because Stitch doesn't do anything magical for us, all it
does is just raw data dump from one place to another.
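The dedup step described above can be sketched roughly as follows. This is a minimal Python illustration, not the actual Power Query used at P3: the row layout, the sample values, and the `daily_clicks` helper are all assumptions made up for the example. It keeps, for each day, the value from the most recent snapshot instead of summing the running totals.

```python
from datetime import datetime

# Hypothetical snapshot rows: (snapshot timestamp, running click total for that day).
snapshots = [
    (datetime(2021, 3, 1, 9), 3),
    (datetime(2021, 3, 1, 10), 7),
    (datetime(2021, 3, 2, 9), 2),
    (datetime(2021, 3, 2, 11), 5),
]

def daily_clicks(rows):
    """Group by calendar day and keep the value from the latest snapshot."""
    latest = {}  # day -> (timestamp, clicks) of the most recent snapshot seen
    for ts, clicks in rows:
        day = ts.date()
        if day not in latest or ts > latest[day][0]:
            latest[day] = (ts, clicks)
    return {day: clicks for day, (_, clicks) in latest.items()}

totals = daily_clicks(snapshots)
```

Summing the two snapshots for the first day would give 10, while taking the latest snapshot gives 7, which matches the example in the conversation. Taking the maximum per day would give the same answer as long as the running total never decreases within a day.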
Rob Collie (01:01:14):
It's really neat like this, going back to that metaphor of, you
want your jet plane built for the reality of the world, with all
kinds of noise, and all kinds of variety, and all kinds of
unpredictable things. And even without dedicated Power BI
connectors, for a lot of our systems, it doesn't matter. We're
going to get that data. And the fact that the Microsoft tools
participate in this larger ecosystem.
Rob Collie (01:01:39):
I've always been really ambitious about defining sort of the new
template for what consulting firms should look like in this new
world. It sounds like a cliche, but 11 years ago, I'm sitting in my
office one day in Cleveland, and I was using Power Pivot and it
just hit me like a thunderbolt. I'd suddenly done something that
was not possible. And I'd done it in a space of like an hour that
had taken weeks and weeks in the previous world. And I could kind
of see that the world was going to change, that the size and
duration of a typical project was going to shrink dramatically,
still have the same amount of impact as the big long project, in
fact, actually better. It's going to have more impact because the
short projects means that you're actually holding people's
attention long enough to iterate and get the real results that the
longer projects never got to because people got too exhausted, and
just called it done even when it wasn't. So the size of the average
project was going to compress dramatically.
Rob Collie (01:02:37):
So the utilization model for a traditional consulting firm, which
has long been like park a handful of people on a six month minimum
project. That whole business model was going to die. Now I thought
it was going to happen a lot faster than it has, it still hasn't
happened, really. We've reached the point where the world has
intellectually agreed that the citizen developer model is primary, and
is important. But for a long time, that was still heresy. So we've
reached the point, we've intellectually accepted that.
Rob Collie (01:03:06):
But that doesn't mean that the real on the ground muscle memory has
changed. This has been the mission for 11 years, go and build this
firm. However, we never took any investment. It's not like there are
really people who would ever want to fund a consulting startup anyway.
Angel investors and venture capitalists, they're always looking for
tremendous intellectual property. They don't want people involved.
Consulting firm has too many people, it's too good of a deal for
too many people. They want something where you can essentially
charge rent when it's done. So we wouldn't have really been able to
attract that kind of funding anyway. Plus, they would have ruined
it if we had taken their money.
Rob Collie (01:03:40):
So we've organically grown, all of our hires, and all of our growth
has been funded out of revenue, which makes it slow or slower
anyway. But someone told me something a long time ago, which is
like, "Let me tell you about my 10 year overnight success. It's not
overnight, but it has been 10 years." And I'll be completely honest
with you, I think that there's really no limit to how large we can
be. It's been a long road, but the way we operate is to run with
these tools as fast and as impactfully as they allow. So we're
great for the customer. We're great for the customer in a way that
I don't think really any other Microsoft partner is. It's a very
hard business model. It's obviously the thing that the customer
needs. But it's a hard business model to sustain which by the way,
we've used the Microsoft platform to make it work internally.
Miguel Llopis (01:04:32):
So your ceiling, your bottleneck is actually going to be at the
very least talent acquisition so that you can scale to more people
as you scale to more customers.
Rob Collie (01:04:41):
Yeah. I learned a lot of things at Microsoft about interviewing
too. And we're using a lot of systems, we have a lot of actual both
software and delegation to assistants and things like that, that
allow us to scale. The hard lessons that I learned about
interviewing at Microsoft, we apply that at national scale. So we
have like a 2% offer rate for our candidates, and we get to pick
the best of the best. So I actually don't think we have a supply
bottleneck either.
Sid Jayadevan (01:05:13):
That's a lot of interviewing.
Miguel Llopis (01:05:15):
We're hiring. So if you have any pointers, we appreciate them too,
both engineering as well as PM.
Rob Collie (01:05:21):
Well, I don't know, that's the kind of consulting fee that I'm
going to have to [crosstalk 01:05:27]. I'm going to have to have
McKinsey white label me, so that I can charge Microsoft the
millions of dollars that they would pay McKinsey, but they would
never pay Rob Collie. Definitely, yeah.
Sid Jayadevan (01:05:42):
Very interesting. Fascinating story. I mean, I've watched from
afar, but I didn't know many of these details. And so yeah, very
helpful.
Rob Collie (01:05:52):
The engineering mindset, I think both of you actually would be
really sincerely kind of interested and fascinated by all the
things that we've developed and found about how to incentivize the
right things with our consultants, for our clients. And there's
something almost, it's not patentable, it's not protectable. But
there is something in the same way that software has intellectual
property, our system, our all-up system: software, people, workflow,
all of that. I'm pretty sure this is the only instance in the world
like it of a company that operates like this. We've had to discover
how to do this rather than there was no template to follow. You
guys both know how exciting that kind of problem is. The same sorts
of things that get you geeked up about going to work at Microsoft
to solve that performance problem or whatever we're talking about,
that same itch being scratched, but in a different plane. No, you
can't have any of my people, Miguel.
Miguel Llopis (01:06:48):
Good to know.
Sid Jayadevan (01:06:50):
And have you been geographically distributed throughout?
Rob Collie (01:06:53):
Yeah. It was really like 2015 was the first time that I realized
the demand for work I was bringing in was exceeding my personal
capacity to address it. And it was just me running the website and
doing the trainings and doing the consulting up until that point,
basically. And I knew that I didn't have time to train up another
consultant that could do the work that I was doing, I needed to
find someone who was basically ready today. And I knew that I
wasn't going to be able to do that, if I was just like, "Let's just
find someone in the vicinity of where I currently live."
Rob Collie (01:07:30):
So the very, very, very first candidates, the very, very first
interviewing that we did as a company was remote. And the first few
people to pass this interview, which again, was designed 100% from
my experience interviewing program managers at Microsoft, and
especially the fact that I've done it the wrong way for years and
then I did it sort of the right way for the last third of my
career. The first people to pass it were all over the country.
They were in Oregon, they were in Iowa, they were in Iowa, they
were in Alabama, and I was in Ohio.
Rob Collie (01:08:05):
So it's actually something that's really interesting. And I'm
almost a little bit bummed about the fact that COVID has rewired
everybody this way. Because for a while, I think we'll still have
this advantage for a long time in a way, but especially given the
nature of the consulting industry that we're in, which is still
very in person. When you can hire from any geography, you can
afford to be a lot more selective. You just have a bigger
denominator. If you want to hold a really, really high quality bar
and clear it, you can do that, if you're not geocentric. So in a
way, we were kind of forced into behaving optimally from the
beginning. It wasn't some fiendish genius plan, like, "Oh, we will
be geo distributed, and we will therefore get the best talent, and
bahaha." It wasn't like that at all. It's just like, "I need a
person and there's no way that I'm going to find one in Cleveland."
And all followed from there. So it's really insane. Heck of a
journey.
Sid Jayadevan (01:09:09):
It's an amazing success story. And sounds like you guys feel like
you're just getting started.
Rob Collie (01:09:16):
And trust me, plenty of failures along the way. I found out
somewhere along the way that I actually am not good at running a
business. And it's like you find out you're not good at driving a
boat and the way you found out is I just crashed it onto a reef. I
did that. I almost killed my own baby at one point. And I had to
realize that I needed to share the steering wheel. And so the guy
whose podcast went live this week, Kellan, he's the architect of
almost all of these good things I've been talking about. My vision
and the things that I wanted to have happen, never ever would have
met reality without Kellan to bring them to life.
Rob Collie (01:09:54):
And he was one of the first people to pass the interview. I hired
him as a consultant originally. I had no idea that I was hiring my
other half at the time. It took a long time for me to come to terms
with that. So hard road, lots of humbling, really humbling
experiences. Well, guys, I'm sincerely grateful to be able to grab
a couple hours of your time. Thanks for doing it.
Miguel Llopis (01:10:16):
Thanks.
Sid Jayadevan (01:10:16):
Thanks for having us.
Announcer (01:10:18):
Thanks for listening to the Raw Data by P3 Podcast. Find out what
the experts at P3 can do for your business. Go to
powerpivotpro.com. Interested in becoming a guest on the show?
Email
email lukep@powerpivotpro.com. Have a data day.