
Is any bootcamp releasing H2 2022 data yet? Anyone want to scrape/parse LinkedIn for placement data?


u/michaelnovati replied · ★ FEATURED
I'm too busy to help, but I can give directional advice on where to look. Well over a year ago I got fascinated by Codesmith and OSLabs, as a bunch of resumes crossed my desk listing months and years of experience that turned out to be 2-3 commits over 3 weeks. I sampled 200 LinkedIn and GitHub profiles, which showed that most graduates exaggerated their work compared to their GitHub contributions. I can easily show this method if a group of people want to reproduce it.

u/customheart wrote (the comment Michael replied to):

Interesting! Care to share this in a Google sheet via DM? :) Don't understand how they could have stated that experience was months of work unless it was on a different account, or the very first commit took ages? I have a work github and a bootcamp/personal github account. Thei

u/michaelnovati replied · edited · ★ FEATURED
I don't plan on sharing the actual spreadsheet or any personal information, because I respect all/most of the individual Codesmith alumni I know very deeply and don't want to be a part of any individual person being called out or focused on negatively. In fact, I help a number of Codesmith alumni on a 1-1 basis figure out options and paths forward in this market, and I know how unique and personal these convos can get, so "exposing people" would be against my goals.
Yeah, I'm EXTREMELY good at cmd+click, cmd+tab, etc... lol. Apple Automations and formulas in Sheets can help. For example, if you have a GitHub handle, you can make a formula to create a URL concatenated with that handle, then copy those into a text file, open it in the Terminal, and cmd + double-click on each one to open it in a new tab. 5 mins to get 200 GitHub pages open. And then, say we want to go to a tab in the same location on all profiles: cmd + \] + click while not even moving the mouse, and you can toggle through 200 tabs in a minute and get them all to a certain tab.
If you want to look at the top ones, look at Codesmith and HR. If you want to start with Codesmith, you can just go to [https://github.com/oslabs-beta/](https://github.com/oslabs-beta/) and [https://github.com/open-source-labs/](https://github.com/open-source-labs/) and make a list of all the projects modified in the past \~1 year. Then find the website for each one by googling the link to the project; most of the websites have a list of team members with LinkedIn + GitHub, and you can cmd+click to open all of those. Takes about 20 mins to get a few hundred LI profiles open. Then you can make a spreadsheet with a list of URLs and whether they have a job reported on LI or not.
When I did it, I noted the number of months and the dates they listed for their OSP. Then I repeated with the number of commits and the time delta between first and last commit on their OSP.
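
The spreadsheet steps above (build a URL from each handle, then record commit count and the time between first and last commit) could also be sketched in Python. This is only a rough illustration, not the exact workflow used: the handles and commit dates are made up, and `profile_url` and `commit_span` are hypothetical helper names.

```python
from datetime import datetime
from typing import List, Tuple


def profile_url(handle: str) -> str:
    """Build a GitHub profile URL from a handle (same idea as a Sheets concatenation formula)."""
    return "https://github.com/" + handle.strip().lstrip("@")


def commit_span(timestamps: List[str]) -> Tuple[int, int]:
    """Return (number of commits, whole days between first and last commit)."""
    if not timestamps:
        return 0, 0
    dates = sorted(datetime.fromisoformat(t) for t in timestamps)
    return len(dates), (dates[-1] - dates[0]).days


# Hypothetical handles and commit dates, for illustration only.
handles = ["example-user-1", "@example-user-2"]
urls = [profile_url(h) for h in handles]

commits = ["2022-03-01T10:00:00", "2022-03-10T09:30:00", "2022-03-21T18:45:00"]
count, days = commit_span(commits)

print(urls)
print(count, days)
```

The commit timestamps would still have to be collected by hand (or by some other means) from each project's contributor page; the sketch only covers the bookkeeping once you have them.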
Average time claimed working on OSP on LI was a range of 6 to 12 months, and the average number of commits was 2-3 over 3 weeks. (Ranges because LI has ranges.) Look at a big project like Swell: [https://github.com/open-source-labs/Swell/graphs/contributors](https://github.com/open-source-labs/Swell/graphs/contributors) and see the commit patterns. Everyone's commits are a little spike: 2-3 commits over 3 weeks, and then they claim 3-12 months of experience on LI. When I saw the pattern of the vast majority of people doing it, I started ringing alarm bells that something weird was going on. Numerous people sent me internal docs showing Codesmith told people not to lie about their experience, so I've been intrigued ever since to get to the bottom of how and why this happens.
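
To make the "ranges because LI has ranges" point concrete, here is one possible way to summarize such a spreadsheet: average the low and high ends of the claimed range separately, next to the average commit count and commit span. This is an assumed interpretation, not necessarily how the original numbers were computed, and every row below is invented purely for illustration.

```python
from statistics import mean

# Hypothetical rows, one per profile:
# (claimed months low, claimed months high, commits, days between first and last commit)
rows = [
    (6, 12, 2, 20),
    (3, 6, 3, 22),
    (6, 12, 3, 21),
    (3, 6, 2, 21),
]


def summarize(rows):
    """Average each column; the claimed duration stays a (low, high) range."""
    return {
        "claimed_months_low": mean(r[0] for r in rows),
        "claimed_months_high": mean(r[1] for r in rows),
        "avg_commits": mean(r[2] for r in rows),
        "avg_span_days": mean(r[3] for r in rows),
    }


summary = summarize(rows)
print(summary)
```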

u/customheart wrote (the comment Michael replied to):

I tried ChatGPT. It produced something that made sense when I glanced at it, but I haven't run it yet. There is also someone's 2-year-old code on GitHub, which might not work anymore. I figured asking here might be more fruitful, or someone would know a better data source.

u/michaelnovati replied
There's no shortcut for hard work here. It's a lot of work to dig this stuff up, but you can use programming thinking to make it a lot faster :D I commented above/below about some ideas.