Machine Learning and Technical Interview Newsletter June 03
Technical Interviews
Rejection:
Rejection sucks. But even the best and most famous people get rejected. Max Howell, creator of the popular tool Homebrew, famously tweeted a rant about how ridiculous the Google interview process was after being rejected.
View this skill card to see famous, accomplished developers getting rejected by their dream companies. Many of them end up somewhere awesome later! These messages will make you feel a LOT better. If you experience negativity during your job search or at work, please keep in mind that you are not alone.
Returnship:
"(plural returnships) An internship-like program for experienced workers seeking to reenter the workforce after an extended period, particularly in a new line of work." - YourDictionary.com
Big tech companies like Facebook, Google, and Amazon offer returnships to mothers who code, analysts, data scientists, and other professionals. These special programs are designed to help people who want to return to the workforce after a career gap. Career gaps can otherwise reduce salary and hinder career development, and these programs help offset that. A gap can come from hardship, pregnancy, childcare, caregiving for elderly or sick family members, loss of employment, etc. Some companies even offer fertility treatment as a perk/benefit.
Behavioral interview questions
Talk about your hobbies and extracurricular activities at interviews.
https://ml.learn-to-code.co/skillView.html?skill=iG4pWJh80ApPfCmsvptb
News and Trends: Are tech companies leaving San Francisco and New York for Austin?
Track the great tech exodus here.
Common Technical Interview Data Structures
"More precisely, a data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data. Data structures serve as the basis for abstract data types (ADT)" - wikipedia
During interviews you are not tested on exact implementation details. You will be tested on conceptual knowledge: when to use each structure, what its pros and cons are, and whether it suits the problem statement. Big tech companies also care about pairing efficient data structures with efficient algorithms.
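To illustrate why that pairing matters, here is a minimal Python sketch (the function names are ours, not from any interview): the same duplicate-detection algorithm run with a list versus a set. Only the choice of data structure changes the overall complexity.

```python
# Membership testing: a list scans every element (O(n) per lookup),
# while a set uses hashing (O(1) average per lookup).

def has_duplicates_list(items):
    seen = []                  # O(n) lookup -> O(n^2) overall
    for x in items:
        if x in seen:
            return True
        seen.append(x)
    return False

def has_duplicates_set(items):
    seen = set()               # O(1) average lookup -> O(n) overall
    for x in items:
        if x in seen:
            return True
        seen.add(x)
    return False

data = [3, 1, 4, 1, 5]
print(has_duplicates_list(data))  # True
print(has_duplicates_set(data))   # True
```

Both functions give the same answer; the set-based version is the one interviewers expect because hashing makes each lookup constant time on average.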
Here are a few data structures:
https://ml.learn-to-code.co/skillView.html?skill=uDdNicYdPaIvb6nCCtYp
https://ml.learn-to-code.co/skillView.html?skill=VdPdeRsgnCYTI3rZGnfx
https://ml.learn-to-code.co/skillView.html?skill=5yuPtI4yOrzbOmFOAOJF
https://ml.learn-to-code.co/skillView.html?skill=0Pbh0VPwFCf3FakeCIS6
https://ml.learn-to-code.co/skillView.html?skill=yCIkkujpHzwBQ3tjlVoR
We will discuss more in the next newsletter.
Got a question ❓ Message us on the message tab.
New Features
Log in to vote for skill cards. Like a card to save it to the Favorite tab.
Send us messages on the Message tab.
Pricing page (substack) $5 / month
Launching soon 🚀 sample code
Launching soon 🚀 company landing page, learn more about engineering at xyz company without having to google frantically
Machine Learning tutorials
Machine learning versus conventional programming
As previously mentioned, machine learning differs from conventional programming: there is no need to give specific step-by-step control-flow instructions. The model learns by updating weights, and different models have different architectural elements. Machine learning and deep learning code can be declarative, too. We tell the data and the model to move to the GPU; we don't specify how, and we don't need to know. Declarative programming is like writing HTML: you just have to know which tags to use. There are control flows in our tasks, but far fewer compared with rule-based programming. Using deep learning libraries is also similar to using other APIs: we need to know what information a library expects and what kind of information it gives back. Documentation is helpful.
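A tiny Python sketch of the contrast (toy example of our own, not from any library): in conventional programming we write the rule ourselves; in machine learning we only provide examples, and gradient descent finds the rule's weights.

```python
# Conventional programming: we hand-write the rule.
def celsius_to_fahrenheit(c):
    return 1.8 * c + 32.0

# Machine learning: start with arbitrary weights and learn
# the same rule from (input, output) examples.
def learn_linear(pairs, lr=0.01, epochs=5000):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in pairs:
            pred = w * x + b
            err = pred - y
            # gradient descent on squared error: nudge the weights
            w -= lr * err * x
            b -= lr * err
    return w, b

examples = [(c, celsius_to_fahrenheit(c)) for c in [0, 1, 2, 3, 4]]
w, b = learn_linear(examples)
print(w, b)  # learned values approach 1.8 and 32.0
```

Nobody told the second function the formula; it recovered the slope and intercept by repeatedly checking its answers against the examples, which is exactly the "learn by updating weights" loop described above.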
Why is it important to split data?
A quick reminder: data needs to be preprocessed and split. Since we will spend many future tutorials on data preprocessing, we will revisit it later. Data should be split into three randomized / shuffled / stratified datasets: train, validation, and test.
Data for training and selecting machine learning models is split into three parts: 1) training data, 2) validation data used for computing metrics and model selection, and 3) a hold-out dataset that mimics real-world data, used last.
The `train_test_split()` function (from scikit-learn) is used during the data preparation phase of the machine learning workflow, after cleaning.
Some refer to the validation set as the test set. The first dataset, train, is for training the model and updating weights: models learn by tuning weights and then checking answers. Whatever the second dataset is called, it is used for evaluating models, making adjustments, and testing performance. The final dataset mimics real-world data, so it must be representative. It must be a hold-out dataset: the model must never see it until training is nearly complete and the model is ready for its final sanity check. Machine learning models are powerful enough to pick up "leaks" from any data they have ever seen; leakage biases the model and reduces its ability to generalize.
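The three-way split can be sketched in plain Python (scikit-learn's `train_test_split()` does the shuffling for you; here we do it by hand so the logic is visible; the function name and fractions are our own choices):

```python
import random

def split_dataset(data, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle, then split into train / validation / hold-out test sets."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    shuffled = data[:]                 # copy so the original is untouched
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # hold-out: only touched at the very end
    return train, val, test

train, val, test = split_dataset(list(range(100)))
print(len(train), len(val), len(test))  # 70 15 15
```

Note that each example lands in exactly one of the three sets; that disjointness is what prevents the leakage described above.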
Easter eggs for subscribers
Pro members can access premium freebies. Here are some recent ebooks:
https://ml.learn-to-code.co/skillView.html?skill=674gahFoQgi0ihMv9Fsx
https://ml.learn-to-code.co/skillView.html?skill=QIy1G3Se1ZRkrKmcXOch
https://ml.learn-to-code.co/skillView.html?skill=d4rMWbSXLAaCA4kutTOQ
https://ml.learn-to-code.co/skillView.html?skill=mwO6vPVWZW2OtYyK5l7B
High quality cheatsheet for researchers
Additional Resources
What’s Level 5 driving? See our Level 5 driving flash card: http://ml.learn-to-code.co/skillView.html?skill=I3O3hd7VyET9u1zUecfy Toyota acquires Lyft’s self-driving unit. We summarize the key points and the take-home message in this flash card.