SQL, Databases, and Basketball Stats
I swear, I love all of these job postings until they say SQL is a requirement. I love collecting and analyzing data, but I just don't use databases.
Re: Houston Rockets Job Openings - Analyst and Intern
bbstats wrote: I swear, I love all of these job postings until they say SQL is a requirement. I love collecting and analyzing data, but I just don't use databases.
If you want to get into basketball stats as a vocation, I would certainly say it is pretty critical to learn.
Re: Houston Rockets Job Openings - Analyst and Intern
But is it an essential prerequisite for every entry-level hire in a big analytics shop? Maybe it is, but in a big shop there might be room for a bit more variety of entry skills & aptitude.
How long would folks who know estimate it takes to become reasonably proficient at SQL? A hard study week, a month, or 3 months? More? If one can pick it up in that fairly short timeframe, it does not seem absolutely critical to have in advance, provided the hire is intended for the long run, there are others around doing the SQL already, and one gets put on simple SQL tasks before taking on more involved ones.
Re: Houston Rockets Job Openings - Analyst and Intern
I learned SQL on the job in my current role in analytics at a financial services organization. It took a couple of weeks to become fairly proficient. My previous training was in SAS. If you have previous programming experience, I imagine it is something you can pick up pretty quickly.
Re: Houston Rockets Job Openings - Analyst and Intern
As far as I'm concerned using advanced databasing such as that available in SQL stifles creativity in model building. All this prepackaged software that makes certain things easy to do pushes you away from being genuinely inventive.
Re: Houston Rockets Job Openings - Analyst and Intern
v-zero wrote: As far as I'm concerned using advanced databasing such as that available in SQL stifles creativity in model building. All this prepackaged software that makes certain things easy to do pushes you away from being genuinely inventive.
You do creative work on smaller samples/by hand/R/Excel etc, then for permanent use of said model, you try to find a way to automate it with SQL. Does that sound right?
I don't know SQL; I don't know how hard it would be to, say, automate the calculation of ASPM. But it would be nice to do so, rather than clicking "update" on an Excel file.
Re: Houston Rockets Job Openings - Analyst and Intern
DSMok1 wrote: You do creative work on smaller samples/by hand/R/Excel etc, then for permanent use of said model, you try to find a way to automate it with SQL. Does that sound right? I don't know SQL; I don't know how hard it would be to, say, automate the calculation of ASPM. But it would be nice to do so, rather than clicking "update" on an Excel file.
I do my creative work on what I consider to be suitably recent game data (2002 to the present) entirely in Python, using the whole dataset, running significance tests on additions to the work in R, and yes, if I like something, databasing it, though still in Python. I can quickly write and test code in Python that would be next to impossible to write quickly in something as limiting as a standard databasing language.
Re: Houston Rockets Job Openings - Analyst and Intern
You can do your modeling outside of SQL and 'score' your data with whatever coefficients/projection factors, but once completed, loading that into a structured database where you can build SQL for ad-hoc/Business Intelligence type reporting for non-technical folks is powerful. It also provides data integration possibilities with other sources of data that a team may have.
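To make that workflow concrete, here is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for whatever database a team actually runs. The table name, column names, and every number below are made up for illustration; the point is only the shape of the process: score elsewhere, load the results, then report with plain SQL.

```python
import sqlite3

# Hypothetical output of a model built elsewhere (R, Python, Excel):
# (player, team, minutes, model_score). All values are placeholders.
scored_rows = [
    ("Player A", "HOU", 2400, 3.1),
    ("Player B", "HOU", 1800, -0.4),
    ("Player C", "DAL", 2100, 1.7),
    ("Player D", "DAL", 900, 0.2),
]

conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
conn.execute(
    "CREATE TABLE player_scores ("
    "player TEXT, team TEXT, minutes INTEGER, model_score REAL)"
)
conn.executemany("INSERT INTO player_scores VALUES (?, ?, ?, ?)", scored_rows)

# Ad-hoc report: minutes-weighted average score per team, best first.
report = conn.execute(
    """
    SELECT team,
           SUM(minutes * model_score) / SUM(minutes) AS weighted_score
    FROM player_scores
    GROUP BY team
    ORDER BY weighted_score DESC
    """
).fetchall()

for team, weighted_score in report:
    print(f"{team}: {weighted_score:.2f}")
```

Once the scored rows sit in a table like this, a new report is just another SELECT, which is roughly what makes it usable for non-technical folks through a reporting tool.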
Re: Houston Rockets Job Openings - Analyst and Intern
I looked through a number of SQL instructional books last night. I got over being intimidated by the language and came away with an initial degree of comfort that I could learn to use it.
I did have some questions (overlapping or redundant). If any feel able to answer some of them from experience or better informed guessing I would appreciate the assistance.
1. My sense was that an experienced SQL expert would be 10-100+ times quicker than me (initially hunting through guidebooks and pecking out commands) but that I might be able to get reasonably comfortable and close the gap to only 2-5 times if immersed in its use and especially if I got to look over someone’s shoulder for 1-4+ weeks. I doubt I could ever equal the speed of an expert. But I would also guess that the time spent doing the programming is only 10% - 25% of the time that is spent on data entry and is / should be spent later analyzing the output. Is this roughly true?
2. Do the database systems of NBA teams tend to have hundreds or thousands of hours of programming into them at this point?
3. Is it closer to the mark to say that the system design is updated in the summer and then during the year almost all the effort is data entry, analysis and reporting or is the system expanding / significantly changing during the season?
4. Would you estimate that the majority or nearly all of the programming could be done by a staffer with 1-2 SQL courses and 1-2 years of experience, or does a substantial portion of the work require very experienced or even expert staff?
5. What % of the work in SQL is truly too big and complicated to be done in Access?
6. Is it routine for NBA teams to use SQL or other programming tools to search out data patterns in play by play data?
7. Would it be more accurate to say that most NBA systems are comparable in sophistication to cutting-edge major global companies, typical mid-size firms, or small firms with basic systems?
8. How big are NBA databases? Hundreds of gigabytes, into terabytes and whatever the next step up is?
9. Are these systems on small, mid-size servers or top of the line stuff? Would any of these teams use / benefit from supercomputers?
Re: Houston Rockets Job Openings - Analyst and Intern
FWIW, I think SQL is important to know...if the company that is hiring you uses SQL. No brainer, right? A company is not changing its database infrastructure because someone doesn't like it.
Having said that, in my own time and with the work I did for CAC, I started using a NoSQL database (MongoDB) quite heavily in recent months. I find it pleasurable to use, but maybe that's mostly because the custom of TYPING IN CAPS LOCK for SQL bothers the heck out of me (it's not required, btw).
For anyone who has good scripting experience (JavaScript, Python, Ruby, etc), it's worth taking a look at MongoDB. Oh, and especially if you are familiar with JSON. Also, a lot of companies are either starting to transition away from SQL into NoSQL (and Hadoop/MapReduce/BigTable) or adding it to their infrastructure.
The reason I think learning NoSQL right now might be a competitive advantage is that it's still relatively new, so you're on more equal ground with programmers out there who have been using SQL for decades.
One more piece of advice. Sign up for the (free) Coursera course on databases. It's really helpful. You'll learn a lot. It's mostly SQL and relational theory (which is important to understand), but there is also discussion of XML/JSON/NoSQL.
My two cents.
Re: Houston Rockets Job Openings - Analyst and Intern
For a job I had to learn MSSQL and MySQL, setting up databases in both, but they are not that different anyway. It is rather easy to learn, at least if you have some experience in programming (C, Fortran, etc.) and using scripting languages (JavaScript, Matlab, etc.). At home I just use XAMPP, which is an all-in-one solution where you can set up a server and database, while using PHP to go through the data. Setting up a database is really simple and fast with phpMyAdmin. I don't think that someone with some experience needs more than a week to have a server with a database running.
For the basketball stuff I wrote a Fortran tool some "ages" ago and I still use it. The files I get out of that are easy to implement into a database.
Re: Houston Rockets Job Openings - Analyst and Intern
You can't become an expert DB architect overnight, but learning the basics of SQL should be easy. If it's not, then it's time to find another vocation. 
Re: Houston Rockets Job Openings - Analyst and Intern
Thanks for the mention of NoSQL. Saw this link. http://nosqltapes.com/ Pretty hard core advocacy.
Evan, does your understanding / view of basketball and basketball analytics influence you to have more interest in noSQL than simply as a standalone abstract or even stylistic product choice?
The course recommendation is a sound one. I might get to it later.
Re: Houston Rockets Job Openings - Analyst and Intern
Crow wrote: Evan, does your understanding / view of basketball and basketball analytics influence you to have more interest in noSQL than simply as a standalone abstract or even stylistic product choice?
Good question. I'm actually thinking about this a lot right now. We really don't have to worry about a lot of the things that concern SQL (transactions, locking, consistency), and by giving those up, you gain a lot of speed (or so I'm told). Mostly what I've been doing is grabbing data that meets certain criteria and I don't need sophisticated SQL-type operations. At the same time, I'm using it more programmatically, and maybe some of the things I'm doing could be done more easily in a SQL system.
One thing I really like about MongoDB is the lack of a rigid schema. It's good when you're designing something on-the-fly and haven't really finalized the design or figured out the most common use cases. For a business that does the same operations over and over, this isn't such an advantage, but for a hobbyist, it's nice to be able to go back and continually re-build and re-shape the schema.
At any rate, a lot of operations map from SQL to MongoDB. In fact, here's a convenient list:
http://www.mongodb.org/display/DOCS/SQL ... ping+Chart
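As one illustration of that kind of mapping, here is a rough sketch in Python that runs a SQL WHERE clause through the built-in sqlite3 module and expresses the same condition as a MongoDB-style filter document. The data and field names are made up, and since no MongoDB server is assumed, the small `matches` helper is a hand-written stand-in that handles only equality and `$gt`; it is not MongoDB's own matching logic.

```python
import sqlite3

# Made-up rows for illustration; field names are hypothetical.
players = [
    {"name": "Player A", "team": "HOU", "ppg": 21.5},
    {"name": "Player B", "team": "HOU", "ppg": 9.0},
    {"name": "Player C", "team": "DAL", "ppg": 18.2},
]

# SQL side: filter rows with a WHERE clause.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, ppg REAL)")
conn.executemany("INSERT INTO players VALUES (:name, :team, :ppg)", players)
sql_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM players WHERE team = ? AND ppg > ?", ("HOU", 15)
    )
]

# MongoDB side: the same condition written as a filter document.
# With a real server this dict would be passed to collection.find().
mongo_filter = {"team": "HOU", "ppg": {"$gt": 15}}


def matches(doc, query):
    """Stand-in matcher (not MongoDB itself): supports equality and $gt only."""
    for field, cond in query.items():
        if isinstance(cond, dict):
            if "$gt" in cond and not doc[field] > cond["$gt"]:
                return False
        elif doc[field] != cond:
            return False
    return True


mongo_names = [doc["name"] for doc in players if matches(doc, mongo_filter)]

print(sql_names, mongo_names)
```

Both sides select the same rows here; the difference is that the MongoDB condition is itself a piece of JSON-like data, which is part of why it feels natural to build queries programmatically.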
Re: SQL, Databases, and Basketball Stats
Wat.
I posted this as a reply to the Rockets' job posting. How did this get here??
EDIT: And my reply to that original thread is gone. Was I dreaming? Is someone gaslighting me?!? *puts on tinfoil hat*