Always 70 and Sunny

Dispelling Big Data Myths


Having finally dug out and recovered from what was essentially a month of vacations for #bigdata (insert mocking comment here), YHC took the opportunity this morning to listen to the substitute podcast episodes 97 and 97.75.  Considering I get questions on how #bigdata works on a weekly basis, I figured now was as good a time as any to clarify exactly how it all works.

  1. It all begins with the backblast. Someone writes a backblast with the Date, QIC and the PAX list.  They tag a region and add a tag for the AO.  Nothing fancy here as this has always happened.  If you don’t tag an AO then it isn’t in the AO’s history.  The OCD in me hates this.  I often go back and clean up posts that don’t tag an AO.  Please tag an AO.  Yes, it should be a required field.  Blame WordPress for not making that as simple as it should be.
  2. #bigdata gets an update from f3rva.com as soon as a backblast is published.  It then takes the information from the backblast and creates a new “workout” in #bigdata, pulling out the Q, each individual PAX member and the AO, and storing all that information.  Don’t ask questions like “why can’t you just read directly from the PAX line to get real-time updates?”  People ask me that question a lot.  Buy me 2-3 high-ABV craft beers and I’ll explain it to you.  Saab may be able to explain it to you now as well since he has a fancy new degree.
  3. To account for inconsistencies in the PAX / Q names, there is an alias concept.  Every person that exists in #bigdata can have 1 or more aliases associated with that individual.  At this point, the alias association is manual (backlog item).  No, there are no machine learning algorithms in place (yet) to automatically associate a likely alias with an actual PAX member.  This is why I give Lockjaw so much grief for coming up with alternative spellings for Two Can whenever he feels like it.  Chuckle all you want but at some point I will get tired of linking aliases and then someone else will have to do it (or it won’t get done).  That is why I respectfully ask that people make a reasonable attempt to spell the name correctly.  I take time once a month to go back and clean up everyone’s misspellings, abbreviations and 19th century British literature equivalents.  No, it’s not the best use of my time.
  4. In the early days of #bigdata, that is all that happened.  As Honeydo put it, once the backblast was published, #bigdata was set in stone.  I had to intervene and make sure it was refreshed when there were changes.  Not anymore.  The backblast author has 2 days to make changes to the PAX list and it will still get updated.  No more complaining to me or texts during the day asking to refresh #bigdata, not necessary.  Don’t question why it is limited to 2 days.  I made a decision.  If someone wants to work with me on triggers in WordPress then we can eliminate the 2 day restriction.  Otherwise, it’s 2 days.  I consider this case closed.
  5. Epoch for #bigdata is 2017-03-19.  Why?  Because that is when I finished testing and released #bigdata to TYA and a few others.  Why didn’t I go back and add every previous workout?  Because that would suck.  It would be a tedious process that I wouldn’t even send offshore (sorry Offshore).  Could this be automated?  Sure.  I just don’t think it’s the best use of my time.  If someone is interested in taking this on as a side project I can make that happen.  Just buy me a few beers.  Preferably something that looks like orange juice so I know Swirly won’t drink it.
  6. Another myth debunked is that people who post preblasts get credit for a Q.  This is an absolute fallacy and was never the case.  In the early days of #bigdata YHC would clean up the data.  Basically this was me exerting my power to delete what I previously called phantom workouts.  TYA was a victim of this one time.  However, it was primarily used to delete preblasts from #bigdata.  This is no longer necessary as long as the author follows a simple rule.  Any backblasts with a date in the future get excluded from #bigdata.  So, if you are writing a preblast, put the date on which the event will actually happen.  Presumably that will be in the future or else by definition it would be a backblast and there would have been a Q and a PAX.
  7. Running Qs should count less than workout Qs.  I disagree with Honeydo on this one and it will remain equally weighted in perpetuity.
  8. The Q ratio was something I added as a humorous poke at Lockjaw when the joke was he was running near 100%.  He’s close.  Little known fact.  If you click on your name, your all-time q-ratio is displayed right above your graph.  The q-ratio on the attendance report is just your q-ratio for that period.  I love this stat as it reminds me when I haven’t Q’d in a while.
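
For anyone who prefers code to beer-fueled explanations, here is a minimal Python sketch of the rules above: comma-separated PAX parsing, manual alias lookup, the future-date exclusion for preblasts, the 2 day refresh window, and a q-ratio. This is not the actual #bigdata code. Every name in it (Backblast, ALIASES, resolve, can_refresh and so on) is hypothetical, and the q-ratio formula (Qs divided by posts) is an assumption about how the stat is computed.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical alias table; in #bigdata the associations are maintained by hand.
ALIASES = {"2 can": "Two Can", "to cannes": "Two Can", "two canne": "Two Can"}
EDIT_WINDOW = timedelta(days=2)  # edits within 2 days of publishing still count


@dataclass
class Backblast:
    workout_date: date  # the Date field of the backblast
    published: date     # when the post went live (assumed to be tracked)
    q: str
    pax_line: str       # comma-separated PAX names
    ao: str


def resolve(name):
    """Map a raw name to its canonical PAX name via the alias table."""
    cleaned = name.strip()
    return ALIASES.get(cleaned.lower(), cleaned)


def parse_pax(pax_line):
    """Split on commas only, resolve aliases, drop blanks and duplicates."""
    names = [resolve(n) for n in pax_line.split(",") if n.strip()]
    return list(dict.fromkeys(names))  # keeps first-seen order


def should_record(bb, today):
    """Posts dated in the future (preblasts) are excluded."""
    return bb.workout_date <= today


def can_refresh(bb, today):
    """True while the post is still inside the 2 day edit window."""
    return timedelta(0) <= today - bb.published <= EDIT_WINDOW


def to_workout(bb):
    """Build the stored workout record from a backblast."""
    pax = parse_pax(bb.pax_line)
    q = resolve(bb.q)
    if q not in pax:  # assumption: the Q also counts as having posted
        pax.append(q)
    return {"date": bb.workout_date, "ao": bb.ao, "q": q, "pax": pax}


def q_ratio(name, workouts):
    """Assumed definition: workouts Q'd divided by workouts attended."""
    posts = [w for w in workouts if name in w["pax"]]
    if not posts:
        return 0.0
    return sum(w["q"] == name for w in posts) / len(posts)
```

Running q_ratio over all stored workouts would correspond to the all-time number on a PAX page, while passing only the workouts in a date range would correspond to the figure on the attendance report.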

I have a very short list of items currently on the backlog.  If you have ideas on how #bigdata can be improved, let me know.  Grab me in a workout, send me a note/text, whatever it takes.  I enjoy this as a hobby and may eventually get around to your request.

Remember, #bigdata is watching.




  1. Great info, Splinter! Thanks for doing this and all the work for #bigdata. To add to the above, please make sure to separate PAX members in a backblast with a comma (,) and only a comma. I had a period (.) placed after one of my previous Roller Coasters (thanks TYA) and it didn’t count toward #bigdata. (Will an alias fix this?)

    Thanks again!

  2. Thanks for taking your time to do what you do and to explain it as well. I will buy you a couple of fruitcake beers next chance I get.

    You rock, Splinter.

    My phone just did some weird autocorrect stuff but it gave me the idea that you might need an unpaid splintern at some point. Lol

  3. Just curious, do any other regions have big data?

    We should have a new shirt made #BigData is Always Watching!

  4. In various forms. I know HR took this concept and keeps track of data in a completely different way. Is it better? Who knows, every region has its own flair.

    When I was creating #bigdata, I did reach out to the national IT leaders to get some info and to make a few requests. Most were dismissed as they had bigger plans for the f3nation website. They did indicate that some regions do provide some stats but I haven’t seen anything like we have in RVA.

  5. Whew…Lab Rat apologizes. (I complain about a lot of stuff, but I don’t think I have ever said anything regarding big data….am I ACTUALLY not on a shit list?!?)

    Nice work though, it’s a totally rocking app. Planning on dusting off the brew pot while stuck at home during the hurricane….you get a bottle or two when it’s ready. Assuming you still drink straight up stouts without rose petals or hemlock additions.

  6. #BigData has Spoken! Respect and gratitude for this newest F3RVA obsession, Splinter. We love it. Next HDHH, Splinter should drink for free.

    Love the shirt idea from EFH too.

  7. In all seriousness nice work Splinter. I am only half kidding about run Qs. But when I am feeling down I can go to Big Data and see I am always 1 post ahead YTD of Lab Rat…

  8. LOL!!!
    2 Can
    To Cannes
    Two Can (LIFO)
    Two Canne