Oct 13, 2020
In our inaugural episode, Rob and Tom cover all this and more:
Episode Transcript:
Rob Collie (00:00):
Okay, welcome. So this is our inaugural episode and our first guest
is also going to be our ongoing 75% co-host; 75% only because his
schedule won't always allow him to join us, but he's Tom, or Thomas
LaRock. Now, I've known Tom for over a decade and he's just a
fantastic human.
Rob Collie (00:22):
I hope you're going to find that to be a theme here on Raw Data
with our guests and various participants. But data-wise, the thing
I've always found so compelling about Tom is his crossover status.
Now, here's a guy who branded himself publicly as SQL Rockstar for
years, and he kind of still does. And you'd think that pretty much
cements him as a storage professional.
Rob Collie (00:44):
But basically the whole time I've known him, he's been trumpeting
the idea that analytics are the real show. Now, in my experience,
that sort of crossover is atypical and it's super valuable and it
speaks to why I'm thrilled to have him as my fractionally available
co-host and as our first guest. So how do we get this started?
Luke, do you think we could do one of those like fancy produced
intros with music and stuff?
Luke (01:11):
Yes. The budget did allow for a fancy produced intro.
Rob Collie (01:14):
Oh yeah?
Luke (01:15):
Yep.
Rob Collie (01:16):
Well, let's do it then.
Announcer (01:18):
Ladies, may I have your attention please?
Announcer (01:22):
This is the Raw Data by P3 podcast, with your host, Rob Collie, and
your co-host, Thomas LaRock. Find out what the experts at P3 can do
for your business. Go to powerpivotpro.com. Raw Data by P3 is data
with the human element.
Rob Collie (01:40):
Welcome to Raw Data. I'm your host, Rob Collie, CEO and founder of
P3, powerpivotpro.com. And Tom.
Thomas LaRock (01:50):
Hi, Thomas LaRock here. I am a head geek at SolarWinds.
Rob Collie (01:56):
Thomas. He's Thomas LaRock.
Thomas LaRock (01:58):
Thomas.
Rob Collie (01:59):
Yeah. And who else do we have here with us?
Luke (02:01):
My name is Luke. I'm a talk radio guy in South Florida. And I'm the
guy that knows nothing about data.
Rob Collie (02:09):
So welcome to the first ever edition of Raw Data. I'm really
excited by the crew we've got here. So let's jump right in. Tom,
we've... Or should I call you Thomas?
Thomas LaRock (02:23):
You call me Tom. I usually introduce myself as Thomas just so that
people don't think it's like John or Dom or something like
that.
Rob Collie (02:30):
I see. Thomas is easier to parse. I get it. But I just wonder if
you were one of those guys that changed his name at some point,
like you grew up. Because I've always known you as Tom. But Thomas.
So we've known each other now for... We had our 10 year anniversary
recently, didn't we?
Thomas LaRock (02:49):
Yes, we did. It was 10 years ago, this past June.
Rob Collie (02:53):
Yeah. It's crazy. It was like it was just yesterday. So we have
interesting complementary backgrounds, you and I, and that's why I
think I was really excited to do this with you. What's your handle
on Twitter?
Thomas LaRock (03:07):
SQL Rockstar.
Rob Collie (03:08):
SQL Rockstar. Now, Luke here, in his day job, he actually
interviews real rock stars. He just had Sammy Hagar.
Thomas LaRock (03:21):
Oh, I thought you said rock stars.
Rob Collie (03:22):
Oh, cold. All right. Well, he had Jason Newsted. How's that?
Thomas LaRock (03:30):
I don't even know who that is.
Rob Collie (03:34):
I'm starting to regret my selection of co-host. He was the bassist
for Metallica for a while.
Thomas LaRock (03:42):
Oh, yeah. Okay.
Rob Collie (03:44):
Not anymore. Luke is really kind of slumming it with us. He's gone
from real rock stars to SQL rock stars.
Thomas LaRock (03:53):
Wow.
Rob Collie (03:54):
I know, I know. I mean, it's not like I'm any better. I'm not Sam
Hagar. Anyway, in a previous conversation, I said, hey, well, I
come from sort of like the analytics and business intelligence side
of the world. And then I said that your background originated more
on the storage side and you kind of recoiled just a little bit. It
didn't seem like it was right to you. And then you thought about
it. So what does storage mean to you? Is it the right word? Am I
using the right word?
Thomas LaRock (04:32):
Well, I think you are using the right word. I just never thought of
it in that manner. To me, when I hear storage, I think of the guys
in charge of the SANs or racking servers and things of that nature.
The storage admins, they're doing storage. I was a database
administrator. I never really thought about it.
Thomas LaRock (04:53):
But in a way it is storage because it's the storing of the data. So
the data has to go into an engine and then to disk and then from
disk back through the engine and back to the client. So yeah, you
could think of it as storage. I usually think of it or I usually
tell people the focus was on the internals of the database engine
itself. In this case it would be Microsoft SQL Server.
Rob Collie (05:18):
Yeah. So a lot of the history of the analytics industry and the
business intelligence industry is we're still, I think, in the
middle of a multi decade hangover of the influence of the storage
industry on the way that a lot of analytics were, even from a
software industry perspective. By necessity, there have been so
many storage professionals. Storage and retrieval.
Rob Collie (05:51):
If you can't store data and recall it, you can't run a business.
You can't even execute a transaction. Whether you're doing any
reporting or analytics or not, storage is table stakes. And so when
we met, I had recently joined Twitter within the last six months
before we met, we met in what? It was like May of 2010, somewhere
in there anyway. We just celebrated our 10-year anniversary, I should
know this. But I forget. I'm that guy.
Thomas LaRock (06:23):
You forgot our anniversary?
Luke (06:26):
Shame.
Rob Collie (06:27):
Yeah, I know. I know. And so I had joined like the data channels on
Twitter. I had done the right thing. I had gone and I joined the
data channels and I quickly discovered that almost everyone on
those channels were storage professionals. They were primarily
database administrators, DBAs. So when I walked up to you and I got
introduced at that live tweet event that was, I don't know, it was
kind of a funny thing that people used to do.
Thomas LaRock (06:53):
Remember we used to meet people and go places?
Rob Collie (06:57):
Yeah. That was radical stuff. But do you remember the first
question I had for you?
Thomas LaRock (07:02):
I do. I do. It was essentially the clean version is, why are DBAs
so miserable?
Rob Collie (07:13):
It's quite an opener.
Thomas LaRock (07:15):
It is quite an opener. I was stunned because I remember looking at
you and going, first of all, who the hell is this guy? And secondly,
how does he know us so well?
Rob Collie (07:27):
The power of Twitter, man.
Thomas LaRock (07:30):
Yeah, you've been stalking us. And clearly I had no defense. I
wasn't about to sit there and tell you, "Oh no, we're the happiest
bunch of people. What are you talking about?" I sort of looked at
you and I'm like, "I don't know why we're so miserable. I have some
ideas as to why we're so miserable." And I think we talked through
some of those ideas.
Thomas LaRock (07:52):
And if I recall, towards the end of that initial conversation, your
comment to me was basically we were just like the Excel community.
We had a lot of the same traits. Part of our misery was rooted in
working, not just with data, but with users of data.
Thomas LaRock (08:17):
And it was interesting to find the parallel between what I thought
was a unique group of individuals, this database administration
community, and all of a sudden the Excel community. I'm like, what
do we have in common? We actually have so much more in common than
I had ever realized. So your question really opened up a brand
new perspective for me at that moment in time and going
forward.
Rob Collie (08:47):
Yeah. The common thread there, I think, is a community. Although
really the Excel people don't really have a community. They're a
demographic, if you will, but they don't really have a community in
the same way that the DBAs do. They make the world go round. The
world runs because of... And I know there's lots of people that
make the world run, but DBAs and Excel people, people who are good
at Excel, these are incredibly essential roles for the world that
no one really sees or appreciates what really goes into it.
Rob Collie (09:25):
And so when they interface with the rest of the business world,
they tend to be taken for granted, even though what they do is some
pretty arcane skills that are developed there. But yeah, the Excel
people don't have a water cooler in the same way. I have been more
recently monitoring things like the accounting subreddit. Okay.
Now here we go. Here's where the grumpiness is. Yeah. There's some
Excel grumpiness. It's the same kind of blowing off steam outlet as
what I was seeing on Twitter back 10 years ago.
Thomas LaRock (10:03):
Yeah. I've often referred to just the internet in general as a
cesspool of misery. But then you get into that dark corner called
Reddit and you're going to just find I think a lot of people
more... Maybe it's the anonymity. You don't have to really use your
real name and you can just sort of vent.
Thomas LaRock (10:24):
And I think for some people, Reddit is just a place where they can
vent. But for an outsider like me, I'm not really active in Reddit.
I can go there and I'm like, "Wow, these people are really
miserable." No. Actually, they just need to vent.
Rob Collie (10:37):
Maybe. Although honestly I find Reddit, and again, I curate what I
consume from Reddit rather than just like taking the default feed,
but I find it to be the most intelligent and civil corner of the
internet. But again, it's probably because I've tuned it.
Thomas LaRock (10:55):
Oh yeah. Absolutely. There are corners of Reddit where people are
civil. Absolutely. And then there are some horrible, horrible
places. But for all that, nothing's as bad as YouTube comments.
Rob Collie (11:11):
That's what Joe Rogan says.
Thomas LaRock (11:12):
That is the worst, worst thing.
Rob Collie (11:16):
Yeah. I don't think I've ever really gone down that rabbit hole, so
I'm going to probably stay away. Circling back, where you and I
sort of found I think almost immediate common ground was in the
notion that I had come originally from the Excel community. That's
what I worked on at Microsoft for a very long time, was worked on
Excel before I got involved in the business intelligence side of
software.
Rob Collie (11:44):
And now for the last 10 years been running a company in that space,
a consulting company. Where these two worlds meet is where an extra
kind of value is created from data. So there's the primary usage of
data, which is running the transaction. Someone buys something, for
example, you've got to record the transaction. You've got to
process it. You've got to make sure that they paid, all that kind
of stuff.
Rob Collie (12:11):
Obviously that's primary usage of data in business is to make the
actual transactions operate. But then there's this secondary value
of data and the secondary value is mining it, if you will, for
insights about your business to improve and optimize. And that's
where I've been. For a while out there, I was a SQL Server MVP.
Microsoft had knighted me as a SQL Server MVP and I don't know SQL
at all because the BI stuff was looped in with it.
Rob Collie (12:44):
But one of the things I found super, super, super compelling about
you over the years, Tom, is that you don't view where you came from
as the only thing. You're evolving. And you've been very, very open
and enthusiastic about the world of analytics, BI, whatever you
want to call it. Whereas not everyone in your original community,
your community of origin, where I met you, not everyone in your
community that you came from is like that. And you are exceptional
in this regard. You're not the only one.
Thomas LaRock (13:18):
Right. I don't think I'm exceptional, but you are absolutely right.
There are lots of people that would have you understand that let's
say you were at an event, a large three-day conference, and it was
a SQL Server-focused event. Then everything should be core engine, deep
dive, 500 level sessions.
Thomas LaRock (13:39):
And wait, what's this thing? Business intelligence? We don't need
any of those sessions here. Go somewhere else. And those people
absolutely exist. They still exist today. There's fewer I would say
today, but they're out there. There used to be a lot more.
Rob Collie (13:54):
Where'd they go? If they're not there anymore, where did they
go?
Thomas LaRock (13:59):
I think they've started just disengaging with the community as a
whole. Because the middle ground, the middle class, they have kind
of embraced a little bit more of the analytics space because it's
everywhere now. It's prevalent all over. You can't get away from
it. I think some of those more extreme people just don't find
comfort in being around or say they just don't feel that's the
group for them anymore.
Thomas LaRock (14:25):
They want to be with people that are really just core engine.
That's it. So we certainly had that, like when you and I first met,
those people were out there. I wouldn't say I was one of them, but
I would say I was kind of stuck in my own little silo. I had my own
blinders on for my own reasons. It was just stuff that I was
working with and I wasn't really exposed to as much. But over time
I was exposed to it.
Thomas LaRock (14:51):
I have a background in mathematics. And so for a lot of it, it was
kind of interesting and familiar to me. And of course, a few years
after that, that's when data science started becoming an actual
term, which was interesting again. I see a lot of these people
saying, "Oh, well..." I'd also worked on my Six Sigma
certifications at the time. I think I got green belt at the time.
And that was a lot of stats.
Thomas LaRock (15:22):
And these people were just mesmerized by being able to understand
what a standard deviation was and how to apply it and how to use
it. It was the application of these tools in order to get insights
from your data. And I liked it. So I continue to kind of try to
absorb a little bit of that. Meanwhile, I still have one leg in,
hey, what's happening inside the engine? Somebody's going to come
to me and say, "My query is slow. I need to make it faster." And I
want to be able to help them too.
Thomas LaRock (15:54):
But I also want people to come to me and say, "Hey, sales are down.
What can we do?" "Oh, well, what data do we have?" "I don't know."
"Let me help you sort this out and figure out if I can find any
value for you." So yeah, it was an interesting time, I think,
around 2010. And I think that's when BI really started getting more
mainstream. There was a lot of work by Microsoft for reporting
services.
Thomas LaRock (16:21):
Of course, when Power BI came out, I want to say it was about five
years ago now. So there's a lot of work by Microsoft to help turn
the corner. I mean, they even changed the name of, I'm no longer a
SQL Server MVP. I'm now a Data Platform MVP. So even the wording
and everything about it has kind of changed and been a little bit
more welcoming is what I would say. So these days I think most
people are very comfortable with the idea that they might have to
have one foot in the analytics space as well as inside the database
engine.
Rob Collie (16:55):
Yeah. It seems like such obvious, low hanging fruit, if you're
already up to your eyeballs in the data platform in various ways.
Why not even just from a career standpoint go and pick up that
secondary value? You're like nine tenths of the way there maybe.
The challenge though, of course, is always the human element. The
more someone identifies as an IT professional, typically the less
business interested they happen to be. And this is a very, very
broad brush.
Rob Collie (17:28):
So people are listening to this right now going, "Wait, I'm in IT
and I'm obsessed with business." Well, yes. And that's great. And
that's actually a rising trend, this idea of the hybrid. I'm
running into all kinds of people these days with job titles that so
clearly scream one foot in IT and one foot in the business. And
that's the way of the future, I think.
Rob Collie (17:51):
But there's still a pretty strong center of mass there for if you
think of yourself as IT, you're focused on the tech and not
necessarily even as interested in the business problem. And that's
the human component of it. I think that if you view BI and
analytics as just another part of the stack, just another part of
the technology toolkit, well, that's where we've come from. The
entire BI industry has always been like that.
Rob Collie (18:22):
And spoiler alert, it's never worked. That mindset has never once
worked. It's this hybrid mindset and the tools that enable it that
are really changing things right now. Boy, I've really buried the
lead here. Here's the question I have for you, and it's a two
parter.
Thomas LaRock (18:41):
Oh, you didn't say there'd be a quiz.
Rob Collie (18:43):
Yeah. You can't be wrong about this. I'm going to ask you both
parts at the same time so you have an opportunity to contemplate
both answers simultaneously. So you're SQL Rockstar on Twitter.
That's your brand. That's in many ways synonymous with you. And you
know how rebranding is. Rebranding is very difficult. So when did
you first get into storage? When did you first get into SQL
Server?
Thomas LaRock (19:08):
Oh, early two thousands. Let's just put a stick and say 2003-ish. I
was a programmer/developer before that and using Sybase and Oracle
and SQL Server. But around 2003 was when I started doing more of the
database administration role.
Rob Collie (19:27):
Okay. So let's set 2003 as just a semi-arbitrary milestone and say
you could go back to 2003 and tell your 2003 self, "Hey, self, when
you get around to branding yourself, here's what you should call
yourself." Would it still be SQL Rockstar? That's question number
one.
Thomas LaRock (19:51):
Oh, I'm sorry. Do you want me to wait? Okay. I'll wait.
Rob Collie (19:54):
Well, I was hoping for a little bit more than a one word answer as
well, but I'm sure. You're as long-winded as I am, so we're a good
pairing. The second question is if you could instantaneously
rebrand your Twitter, for example, today, you could pick another
handle today and have it be retroactively what you always had been,
would it be the same as your answer for 2003? If you could just
pivot today with no switching cost.
Thomas LaRock (20:21):
Here's something that I guess you didn't know about me. I've
already changed my Twitter handle once.
Rob Collie (20:26):
Oh, I did remember that. Yeah. You were just Thomas the Rock for a
while, weren't you?
Thomas LaRock (20:30):
No.
Rob Collie (20:30):
What?
Thomas LaRock (20:31):
I was SQL Batman.
Rob Collie (20:33):
No, you weren't.
Thomas LaRock (20:34):
I was SQL Batman. And I was SQL Batman because in my role, that's
basically what you are. You're Batman. Something goes wrong, they
call you and you come in and you're a superhero and you have to fix
it. You're just like, I'm Batman. And I had that for maybe almost a
full year of being on Twitter at first. I had stickers that said
SQL Batman with the bat logo. I had all this stuff going on. That's
what I was originally doing.
Thomas LaRock (21:05):
And I ran into some issues with licensing, if you can imagine. If I
wanted to get stickers printed up and use a service, they'd be
like, "We're not doing this. You can't use that." And I'm like,
"Come on. I'm harmless." But no, there's a whole thing about
protecting trademark and copyrights. So I got tired of trying to
utilize the SQL Batman. I even had sqlbatman.com.
Thomas LaRock (21:38):
I just made a change and I said, you know what? I'm going to change
my blog to be thomaslarock.com. I'll just use my name for that. But
for Twitter, I'll use the handle SQL Rockstar. And the reason I
chose rockstar, my last name is LaRock. I've had the nickname, the
moniker, Rockstar since I was about 16. My friends in high school,
like LaRock, rockstar. It's just the way it was.
Thomas LaRock (22:03):
So I decided I would just call myself rockstar. But at the time the
Rock Star movie was out and that was yet another problem. So I just
said put SQL in front of it and then it's what I own and I'll move
forward with things. And if I could go back, I'd probably tell
myself, "No, do data rockstar instead." But honestly, these days I
look and I go, "Just use your name."
Thomas LaRock (22:25):
There's no reason to have the thing. At the time, it was what pretty
much most of the cool kids were doing on Twitter. If
you were inside of the database community, you were using SQL in
front of everything and being cute. And I did it and I've never
bothered to change it. So yeah, if I could go back in time, I would
tell myself have more of a different focus. If I wanted to use a
cute moniker, it'd probably be Data Rockstar, Data Pro, or
something like that.
Thomas LaRock (22:58):
If you notice, actually for a while, I didn't use my real name. It
was at SQL Rockstar and the name was also SQL Rockstar. I changed a
few years back to put my real name in there. So you can see Thomas
LaRock and at SQL Rockstar. That was kind of my compromise for
myself instead of just changing my handle, which I don't think I
could do, because once I got verified, you can't touch things.
Otherwise they take away your check mark. You don't want to lose
the check mark. The check mark makes me legit.
Rob Collie (23:25):
Oh yeah. That's right. That's why I invited you. It's the check
mark. We needed more check mark. Infinite percentage increase in
check marks on the show. So yeah, data rockstar. That would've been
good. I could get behind that. Here's a hypothesis of mine. That's
more than a hypothesis. This is an opinion of mine. It's been
really interesting. I've watched this evolve over probably almost a
full 20 years I've been watching this story and it's basically that
storage changes all the time, but analysis is...
Rob Collie (24:00):
Actually, there's all kinds of technological improvements that
allow us to do things fast or do things better, lower friction, et
cetera. There's been a lot of big changes on that front. That's
really like the reason why my company even exists is because of how
much change is happening and already has happened in that space.
But at Microsoft in the early two thousands, about the same time
that you were getting into SQL Server for the first time, Microsoft
was really struggling with this, starting to wake up to the idea
that data might not be just stored in tables.
Rob Collie (24:39):
That data might not always be table shaped. And this was really
causing almost like an existential crisis in the data world at
Microsoft. And it's really funny. I got to read white papers
written by like the architects that even at the time were being
paid like $3 million a year. These really highfalutin
white papers that sounded really smart. And from what I
remember of them now, they had no bearing on where the world
actually went.
Rob Collie (25:11):
They were just completely off. Basically the answer was, oh, well,
we'll just make it so that we can also store XML blobs in SQL
Server and that'll take care of it. I mean, there are all kinds of
funny things. There are so many funny things to reflect on. But
while the storage half of Microsoft was freaking out about this and
while off in the shadows things like Hadoop were being born that
were the real answer to this crisis. Not XML blobs in SQL.
Rob Collie (25:49):
The analysis world was also panicking about it at Microsoft.
Something really fundamental about the way that we work was
potentially at risk. And there was another architect at Microsoft
who had decided to kind of crash the Excel team for about a year.
He just needed a place to land. This guy had been around forever.
He came to the Excel team to tell us that the Excel grid, rows and
columns, that was old fashioned, that was outdated, and that was
going to go away. We needed all of that sort of jagged,
heterogeneous content, where like, how do you store a webpage?
Rob Collie (26:32):
That the content of a webpage doesn't fit into a row oriented
storage that well, does it? And each web page is different. Every
different piece of information might have different columns if you
were going to try to store it in a column oriented way. And he was
convinced that that same phenomenon was coming for Excel. That
every row of data in Excel might have a different column set than
the previous row and Excel's whole formula language, everything
needed to be redone to accommodate this.
Rob Collie (27:11):
Now, of course, we now have the benefit of hindsight, 15, 17 years
later, this hasn't happened and no one's dying for it either. But
remember, this is like a main man at Microsoft. This is someone
that Gates himself... He was almost like a lieutenant of his. And at
that point in my career, I'd already finally learned enough to know
when someone was just like totally off the rails. And he was so
much more senior than I. The only card I had to play against him
was just to repeatedly say like Tom Hanks in Big, to say, "I don't
get it," just over and over and over again.
Rob Collie (27:55):
Rows and columns. We've had rows and columns in Excel forever.
Everyone's played Battleship. They know how to line up a row and a
column and give it a coordinate. Your thing. I don't get it.
Thomas LaRock (28:07):
It's a great strategy.
Rob Collie (28:11):
I also knew that it would work because my superiors had made it
very clear to me that we weren't going to do stupid, like crazy
computer science things with the Excel product. We had more
responsible things to do. But none of them really wanted to go toe
to toe with this guy. So they kind of put me out there. But I knew
who was writing my review.
Rob Collie (28:34):
And this guy would go around and tell everybody behind my back
that, "Whenever I bring this up with Rob, that guy, that guy, he's
just done. He's just done." He was so obnoxious. He's basically
telling everyone that I was too stupid to understand what he was
going for. But again, even him now, all these years later, he would
probably admit that analysis is rows.
Rob Collie (29:00):
The only things that you analyze are the things that are in common
between like if you've got like 15 rows of data or 5 million rows
of data, and some of them don't have certain columns, well, you
wouldn't be including those in your analysis unless you had common
attributes in each row of data that was interesting to analyze.
Otherwise that row wouldn't even be involved.
Rob Collie (29:29):
So the way I've been boiling this down for people lately is that...
And this architect, he described this odd heterogeneous storage, he
described it as curly data. And I liked that. I like that idea,
curly data. And like you need to go store the internet for a search
engine, damn straight that's curly data. That is not nice, clean
tables.
Rob Collie (29:55):
But when it's analysis time, analysis you're always pulling
rectangles. You're always extracting rectangular table shaped row
sets in order to perform the analysis. That's separate from the
storage. So the query engines that have been built over time that
allow you to retrieve data from things like Hadoop. Well, how many
SQL-like interfaces have now been built to pull regular shapes
out of those sources?
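As an illustration of the point above (not part of the conversation), here is a minimal Python sketch of pulling a rectangle out of "curly" data. The records, the column names, and the helpers `extract_rectangle` and `total_by` are all hypothetical, made up for this example. The assumption is that only records carrying every requested attribute take part in the analysis, which is how Rob describes it.

```python
# Hypothetical "curly" records: each one carries a different set of attributes.
curly_records = [
    {"order_id": 1, "region": "East", "amount": 120.0, "coupon": "FALL10"},
    {"order_id": 2, "region": "West", "amount": 75.5},
    {"order_id": 3, "amount": 42.0, "gift_note": "Happy birthday"},
    {"order_id": 4, "region": "East", "amount": 310.25, "referrer": "newsletter"},
]


def extract_rectangle(records, columns):
    """Return one fixed-width row per record that has every requested column."""
    rows = []
    for record in records:
        # Records missing any requested attribute are left out of the rectangle.
        if all(column in record for column in columns):
            rows.append(tuple(record[column] for column in columns))
    return rows


def total_by(rows):
    """Sum the second field of each two-field row, grouped by the first field."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0.0) + value
    return totals


# The analysis only ever sees the rectangle, not the storage shape behind it.
rectangle = extract_rectangle(curly_records, ["region", "amount"])
totals = total_by(rectangle)
print(rectangle)
print(totals)
```

Order 3 has no region, so it never enters the rectangle. The extra attributes on the other records (coupon, referrer) are simply ignored.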
Rob Collie (30:32):
So my world of analysis has, at least until now, been very well
insulated from the storage revolution in terms of, what do you want
to call it? Curly data. The curly data storage revolution. And the
same way that analysis wasn't disrupted terribly much by the
transition from tape reels to hard drives, the fundamentals of what
you were doing, the technology was different, but the fundamentals
of what you were doing were not rewritten just because we started
storing things differently. That was a monologue. That's one of my
things. What's your reaction? I've never told you that story
before. I don't think so.
Thomas LaRock (31:14):
No, I don't. And my first reaction is, is that person still at
Microsoft?
Rob Collie (31:19):
No.
Thomas LaRock (31:19):
And I need to know right now. I was going to say later you're going
to tell me who that is.
Rob Collie (31:26):
Before we move on then, I won't tell you who it is, but A, he left
Microsoft in a huff shortly after that, after he did not get his
way on that. Right. I kind of get to almost like paint a silhouette
of him on my airplane. And he famously when he told Ballmer he was
leaving, Ballmer threw a chair across the room. So now you know
everything you need to know to look up who this was.
Thomas LaRock (31:53):
Threw a chair across the room, because Ballmer wanted him to stay.
Huh?
Rob Collie (31:58):
Yeah. And he went to Google and all of us on the Excel team were
just sitting chuckling like, "Eat it up. Yeah, you should take
him."
Thomas LaRock (32:11):
Well, that explains a lot about Google Sheets. Okay. So here's the
thing, when you were just describing to me about the curly data and
you're talking about the analytics and you got to the point that
was in the back of my mind as you were speaking, which is that you
say row, I'll say it's... What's the fancy word? Observation. That
row, the observation of a data event, you may not have information
for all those columns or attributes.
Thomas LaRock (32:43):
And that's totally normal to me right now. I'm like, yeah, I get
it. One of the things I say is nobody goes to school to become a
data janitor. Hmm. I didn't. There was no course. And I think your
response to that was here we are. This is what we do. We are the
data janitors of the world, whether you're Excel or you're a DBA,
this was the common ground we had.
Thomas LaRock (33:12):
I didn't know it 10 years ago. It took me a while to get up to it.
But why are we miserable? Because we're data janitors all day. This
is what we do. And why don't we have the observations for all this?
Are you kidding me? I don't know. A sensor went down. Oh, okay. Or
we just didn't think to ask that question. And so it's not included
in 10,000 survey results. We didn't think that question was
worthwhile. It's like, but there was data, and I had this whole
model built and it needed that.
Thomas LaRock (33:42):
Now what am I supposed to do with these 10,000, 20,000 records? It
can be very frustrating. I can see what the man was trying to
describe, but he really wasn't able to articulate what he thought
was coming. And more importantly, he didn't really understand the
tool. He thought the tool had to change, but the reality is the
tool itself didn't have to change.
Thomas LaRock (34:12):
It was the application of the tool was going to become different.
And he couldn't see that even at the time. I mean, Python was a
thing. You could have done so much more. There's a bigger world out
there than just the Microsoft data platform, as great as it is. And
I love it.
Rob Collie (34:30):
Come on, come on.
Thomas LaRock (34:30):
It's true. But there's still stuff out there, stuff out there. But
yeah, that was kind of my thought was this guy was not a data
janitor and he wanted the tool to do this specific thing. So you
guys were going to have to go and reinvent it, which would've been
a huge waste of time. Whoever was really in charge there for you
guys, thank God they knew not to try to shift gears.
Rob Collie (34:55):
Yeah. I completely agree. The thing that he was missing, and it's
not like I knew it then either, if I'd known it, I would've told
him this storage revolution that they saw coming is decoupled from
analysis. Again, up until this point, you never know what's around
the corner. But up until this point, analysis has been insulated. It's
kind of like I just need to know...
Rob Collie (35:25):
Another way to say it is that I don't actually know truly deep down
how a SQL database is structured. I don't need to. I think of the
table that I pull from it, which is oftentimes a view written by a
friendly person, such as yourself. That view is reconstituting my
rectangle of data that I need from all kinds of other tables that I
don't necessarily see.
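Editor's sketch (not part of the recording): the "rectangle from a
view" idea Rob describes can be illustrated with a small example. This
is only a minimal sketch using Python's built-in sqlite3 module rather
than SQL Server, and the table, column, and view names (customers,
orders, sales_rectangle) are hypothetical, invented for illustration.

```python
import sqlite3

# Hypothetical schema: every table, column, and view name here is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 5.0), (12, 2, 40.0);

    -- The "friendly person" writes the view once, joining the tables.
    CREATE VIEW sales_rectangle AS
        SELECT c.name AS customer, o.amount AS amount
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id;
""")

# The analyst queries only the view and never sees the underlying tables.
rows = conn.execute(
    "SELECT customer, amount FROM sales_rectangle ORDER BY customer, amount"
).fetchall()
for row in rows:
    print(row)
```

If the storage behind the view changed but the view kept returning the
same columns, the final query would not need to change, which is the
insulation being described.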
Rob Collie (35:53):
I don't care. It's beautiful. I don't need to. And so if the view,
the rectangle that I'm getting happens to be stored out there on
many different hard drives and in a hive farm or something and in
curly format, but I get a rectangle back, my job doesn't really
change. Take the analysis hat off for a moment. What are your
observations of this?
Rob Collie (36:22):
When I call it like a revolution in storage, is it really? How much
has the curly storage model, Data Lakes, Hadoop, all that kind of
stuff. How much is that... I don't know. I was going to use the
word invaded to sound dramatic. How much has that stuff kind of
invaded your world?
Thomas LaRock (36:43):
Well, in terms of say the Microsoft data platform, it was years ago
when they introduced the concept of PolyBase. So PolyBase is just a
simpler way for you to link to almost any other data structure and
to pull the data into SQL Server. And they're trying to make it
very easy to connect basically from their data platform and extend
into any other platform in order to get the data into one place and
then build your rectangle for you.
Thomas LaRock (37:17):
So it's there and it comes up every now and then and somebody
says I've built this. It's not working as well or things of that
nature. So it's definitely part of the ecosystem these days. And
the latest one, what is it called? Big Data Clusters that Microsoft
just rolled out. They're making efforts to build into their
ecosystem something that is equivalent in other ecosystems.
Thomas LaRock (37:46):
So if you are a Microsoft customer and you need certain
functionality inside that data platform, it actually exists
somewhere. It's a framework, it's a tinker set. All the pieces are
there. You might have to build something more or less than other
things. But a lot of that functionality is really there, especially
in Azure. There's just so much these days.
Rob Collie (38:09):
Azure. This is not SQL Server. Now, there is SQL Azure.
Thomas LaRock (38:19):
No. There's Azure SQL database. Microsoft marketing would not want
to hear you say SQL Azure.
Rob Collie (38:24):
Well, if they're listening, I'll call it a win. If we reach the
point where we can upset people with the way we describe things by
using the... I'll start calling Power Pivot, I'll start calling it
Power BI in Excel. It's just really the only rational name for
it.
Thomas LaRock (38:44):
It used to be called SQL Azure and I love that name.
Rob Collie (38:49):
Look how dated I am. It makes sense why they call it data platform,
because there's just so many things in there now. And so many of
them, as you were hinting at, are clones in a way, improved clones
in many cases of things that we see on the Linux platform. If you
go look at AWS, so much of AWS, the services available there, it's
what I call the Linux cool kids stack.
Rob Collie (39:21):
If you're launching a startup in Silicon Valley, you're issued your
MacBook and here's your AWS subscription. These are like the
starter kit. Microsoft licensed, what is it, HDInsight? That's
basically like a Linux distribution. And so there's a lot of
literal Linux services available on Azure.
Rob Collie (39:46):
And at the same time, you also see these more Windows-based
services in the Azure platform and you start almost like lining
them up. You start saying, "Oh, this one's kind of like that one
from over in the Linux stack," but it's Microsoft taking a look at
it going, "Oh, we can do better."
Rob Collie (40:02):
And so it's a really interesting ecosystem going on over there. Let
me put you on the spot here. Have you done any technical hands-on
work with, I wonder what we call it, modern storage, the curly
storage, or have you been in sort of the chief geek role long
enough that you haven't gotten your hands dirty with that?
Thomas LaRock (40:23):
So it's head geek.
Rob Collie (40:24):
Head geek. I'm so sorry.
Thomas LaRock (40:27):
I do joke that I haven't had a real job in a long time. I'm very
far removed from my production DBA days. However, in my role as
head geek, I get my hands on the things, but not for production
purposes. It's more for I've got to learn to understand what these
things are doing, how certain things work, because I need to be
able to explain some stuff to others.
Thomas LaRock (40:49):
But what I have done, and it's been a few years, Microsoft
partnered with edX and they put together some certification
programs. So you would take like 10 classes online through edX and
they would align with a certification. I got a certification in,
let me think now, well, one was in big data, one was in machine
learning, I think, and another one in artificial intelligence. So
have I put my hands on the curly data? I'd say yes.
Thomas LaRock (41:24):
But those being Microsoft focused programs, it was touching a lot
of areas of Azure. So did I have to go into Azure Data Factory,
consume some data, transform it, write some U-SQL to pull some
insights out of it? Yeah, I had to do all those things. It's been a
while. If I had to do it again, I could probably go back and figure
it all out again.
Thomas LaRock (41:47):
But once I did it for the program, there was really no need for me
to touch it again. Lately what I have been doing is I've been
spending a lot of time learning Python. Sometimes people say, "What
should I learn, Python or R?" And I kind of view it as two
different things. I think R is very much focused on being a tool
for data scientists. And I think if you're a data scientist, you
want to use it. That's great.
Thomas LaRock (42:12):
I think Python is a little more extendable. It can do all the same
data science things that R can do, but it can also do some other
things. So that's why I chose to dive into Python and I've been
spending a lot of time on it. And then there's this little website
called Kaggle. Have you ever heard of Kaggle?
Rob Collie (42:27):
I have.
Thomas LaRock (42:28):
Yes. So I've started doing some learning and competitions in
Kaggle. And again, focused on using Python, but I can also go use
other things. If I need to drag some data into Excel to be a data
janitor for a little bit, then, yeah, I can do that. So there's a
bit of an ecosystem, or say a toolkit, that I built up for myself
now. And that's where I've kind of been spending some time and
getting my hands on that curly data.
Rob Collie (42:58):
I'm all kinds of angry now.
Thomas LaRock (42:59):
Why is that?
Rob Collie (43:00):
I've got a couple of things to straighten out. First of all, in the
answer to the question of, should I learn R or should I learn
Python? The answer to that question is nine times out of 10,
DAX.
Thomas LaRock (43:14):
All right. You're wrong, but that's okay.
Rob Collie (43:19):
Come on. There's a lot of trendiness in it. Now, there's still a
tremendous usage of it. I'm not saying that learning Python is a
bad thing. I think it's actually a really good thing. It's just that so
often people's actual needs are better served by something that might not
have that same kind of cool kids edginess to it.
Thomas LaRock (43:39):
Yeah. I wouldn't want to do a lot of... A lot of times I see Python
being used as all these examples. Some of the ways they're
manipulating data, to me, I'm not sure I would really want to do it
that way. I would want to use a different type of tool like Excel
or Power BI or something, because I'm a little more comfortable
with that than what these lines of code are doing.
Thomas LaRock (44:01):
But if I want to build a model in machine learning, I could use
Azure ML Studio. But under the hood, it's kind of just running the
same code I could just do for myself. So I don't know. It's
either/or, but I just feel that at the end of the day, Python just
has a little bit more.
Rob Collie (44:18):
Yeah. I mean, it's just so often a lot of Python will be written to
draw a chart.
Thomas LaRock (44:26):
Yeah, exactly. Oh no, you're right.
Rob Collie (44:29):
Or to do a very fundamental aggregation that would've been so much
more powerful and flexible if you built a DAX data model around it.
I even go to developer conferences on occasion now and the whole
goal is to say, "Hey, look, you know so many things that I don't
know, you're so much more technical than I, and yet I'm going to do
some things up here on stage that you can't do, really important
everyday things that you can't do and I want you to be upset about
it."
Rob Collie (45:06):
Because I'm really just not that technical. I'm the least technical
person at the company. Everyone we hire is so much better even at
the things that I am good at. They're so much better at those than
I am. A lot of things we're talking about like in the Azure
platform, for instance, we have people who are very good at those
things. I've never seen them. I haven't gotten the certification
that I could even forget, like what you were talking about.
Rob Collie (45:34):
And then you said, "If I need to drag some data into Excel and be a
data janitor." Come on now. Modern Excel that has the DAX engine
and the Power Query engine in it. We've escaped that. We've escaped
the janitor hood as long as we work in an organization that
understands what we can do, which, again, that human factor. Most
companies are very, very, very slow to wake up to the fact that
their resident Excel guru has now become a completely new
species.
Rob Collie (46:03):
The person who discovers what I call modern Excel, which is really
the Power BI engines, the under-the-hood engines baked into Excel,
when they discover that or they discover Power BI itself, they feel
like they're the first person to discover fire. They sit back at
their desk and go, "Oh my God." And they say things like the
equivalent of, "Did you see that?" And everyone looks at them like,
"No, we didn't see anything. In fact, maybe you should get back to
work." It's very unsatisfying.
Rob Collie (46:31):
And then some period of time later, those people end up working for
us. That's where our employees are made, is in those trenches. I
was mostly just joking. You just equated Excel and data janitor
hood so glibly that I had to circle back. I had to say
something.
Thomas LaRock (46:52):
I think Excel is the tool of choice for most data janitors. We
should make a commercial.
Rob Collie (46:58):
That's true. That's true. We've experimented with some advertising
like on Facebook. It's not running at the moment. The ad says, "Are
you running a spreadsheet sweatshop?"
Thomas LaRock (47:11):
Yes. I think I've seen that.
Rob Collie (47:13):
Yeah. We have these people sitting in what looks like a bombed out
factory, but there's all these spreadsheets on these monitors and
everything. The reason I don't like the janitor term is because it
sticks to the person more than it sticks to the org. That's why the
spreadsheet sweatshop, I prefer that nomenclature. That's not the
preferred nomenclature, dude.
Thomas LaRock (47:40):
So I totally get how you have that apprehension about using the
data janitor term. But I want you to know that in those courses I
was doing, to earn that certification, one, or actually more than
one, was taught by a friend of yours, Wayne Winston.
Rob Collie (47:56):
Oh, The Wayne.
Thomas LaRock (47:57):
And Wayne opened my eyes to how to use Excel in so many wonderful
ways with descriptive statistics. And that's the type of stuff I'm
talking about. I'm talking about, hey, I have these columns. How
many are missing values? How many are null? Stuff of that nature that
a lot of people would use Python for, but for me, I might just use
Excel for that from time to time.
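Editor's sketch (not part of the recording): for readers curious what
the Python version of "how many are missing values?" might look like,
here is a minimal sketch using only the standard library. The survey
data and column names are made up for illustration, and an empty field
is assumed to mean "missing".

```python
import csv
import io

# Made-up survey data; empty fields stand in for missing observations.
raw = """respondent,age,region
1,34,East
2,,West
3,51,
4,,
"""

reader = csv.DictReader(io.StringIO(raw))
missing = {name: 0 for name in reader.fieldnames}
total = 0
for record in reader:
    total += 1
    for name in reader.fieldnames:
        # Treat None (short row) and blank strings as missing.
        if (record[name] or "").strip() == "":
            missing[name] += 1

for name, count in missing.items():
    print(f"{name}: {count} of {total} missing")
```

The same per-column count is what a COUNTBLANK formula over each column
would give in Excel, which is the kind of check Tom says he sometimes
prefers to do there.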
Rob Collie (48:19):
Yeah. Good old rectangles.
Thomas LaRock (48:21):
Wayne was so good. Such great courses.
Rob Collie (48:24):
Was it live?
Thomas LaRock (48:26):
No, it wasn't live. It was recorded.
Rob Collie (48:30):
I tell you, a live course with Wayne would be another experience
altogether. He is such a character. I bet they had to edit him
down to 20% of what... He used to visit Microsoft and believe it or
not, teach classes to Microsoft's finance departments.
Thomas LaRock (48:54):
I believe that.
Rob Collie (48:55):
But then he'd come hang out with the Excel team in the evening and
just like hold court. Oh man. It was like drinking from a fire
hose. It was awesome. He lives near me. I mean, I'm in central
Indiana now. I'm in Indianapolis and he's in Bloomington. I've been
here for five years and we still haven't gotten together. That's on
me. I'm probably not going to see him...
Thomas LaRock (49:19):
Can't get together now either.
Rob Collie (49:20):
Can't get together now either. Yeah.
Thomas LaRock (49:23):
So it's not all on you, like the last three months.
Rob Collie (49:25):
Oh yeah. I mean, I've got a good three or four month excuse now. I
mean, I did reach out. We were going to get together, but then, you
know, following up, that's the trick, isn't it? I think that's
probably a pretty good place to wrap episode one. What do you
think?
Thomas LaRock (49:38):
I think so. I think we battled long enough about nothing. Raw Data
is really a podcast about nothing.
Rob Collie (49:44):
Is that what we're going to do? It's a podcast without substance. I
look forward to doing more of these. We have not come close to
talking about everything. We've got lots of ground to cover.
Announcer (49:59):
Thanks for listening to the Raw Data by P3 podcast. Find out what
the experts at P3 can do for your business. Go to
powerpivotpro.com. Interested in becoming a guest on the show?
Email lukep@powerpivotpro.com. Have a data day!