Machine Learning and Technical Interview Newsletter June 03
Technical Interviews
Rejection:
Rejection sucks. But even the best and most famous people get rejected. Max Howell, creator of the popular tool Homebrew, famously tweeted a rant about how ridiculous the Google interview process was after being rejected.
View this skill card to see famous, accomplished developers getting rejected by their dream companies. Many of them end up somewhere awesome later! These messages will make you feel a LOT better. If you experience negativity during your job search or at work, please keep in mind that you are not alone.
Returnship:
"(plural returnships) An internship-like program for experienced workers seeking to reenter the workforce after an extended period, particularly in a new line of work." - YourDictionary.com
Big tech companies like Facebook, Google, and Amazon offer returnships to mothers who code, analysts, data scientists, and other professionals. These special programs are designed to help people who want to return to the workforce after a career gap. Career gaps can otherwise reduce salary and hinder career development, and these programs help offset that. A gap can come from hardship, pregnancy, childcare, caregiving for elderly or sick family members, loss of employment, etc. Some companies even offer fertility treatment as a perk/benefit.
Behavioral interview questions
Talk about your hobbies and extracurricular activities at interviews.
https://ml.learn-to-code.co/skillView.html?skill=iG4pWJh80ApPfCmsvptb
News and Trends: Are tech companies leaving San Francisco and New York for Austin?
Track the great tech exodus here.
Common Technical Interview Data Structures
"More precisely, a data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data. Data structures serve as the basis for abstract data types (ADT)" - wikipedia
During interviews you are not tested on exact implementation details. You will be tested on conceptual knowledge: when to use each structure, what its pros and cons are, and whether it suits the problem statement. Big tech companies also care about pairing efficient data structures with efficient algorithms.
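To illustrate why that pairing matters, here is a minimal Python sketch (the function names are ours, not from any interview): the same duplicate-detection algorithm run with a list versus a set. Only the choice of data structure changes the overall complexity.

```python
# Membership testing: a list scans every element (O(n) per lookup),
# while a set uses hashing (O(1) average per lookup).

def has_duplicates_list(items):
    seen = []                  # O(n) lookup -> O(n^2) overall
    for x in items:
        if x in seen:
            return True
        seen.append(x)
    return False

def has_duplicates_set(items):
    seen = set()               # O(1) average lookup -> O(n) overall
    for x in items:
        if x in seen:
            return True
        seen.add(x)
    return False

data = [3, 1, 4, 1, 5]
print(has_duplicates_list(data))  # True
print(has_duplicates_set(data))   # True
```

Both functions give the same answer; the set-based version is the one interviewers expect because hashing makes each lookup constant time on average.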
Here are a few data structures:
https://ml.learn-to-code.co/skillView.html?skill=uDdNicYdPaIvb6nCCtYp
https://ml.learn-to-code.co/skillView.html?skill=VdPdeRsgnCYTI3rZGnfx
https://ml.learn-to-code.co/skillView.html?skill=5yuPtI4yOrzbOmFOAOJF
https://ml.learn-to-code.co/skillView.html?skill=0Pbh0VPwFCf3FakeCIS6
https://ml.learn-to-code.co/skillView.html?skill=yCIkkujpHzwBQ3tjlVoR
We will discuss more in the next newsletter.
Got a question ❓ Message us on the message tab.
New Features
Log in to vote for skill cards. Like a card to save it to the Favorite tab.
Send us messages on the Message tab.
Pricing page (substack) $5 / month
Launching soon 🚀 sample code
Launching soon 🚀 company landing page, learn more about engineering at xyz company without having to google frantically
Machine Learning tutorials
Machine learning versus conventional programming
As previously mentioned, machine learning differs from conventional programming: there is no need to give specific step-by-step control-flow instructions. The model learns by updating weights, and different models have different architectural elements. Machine learning and deep learning code can be declarative, too. We tell the data and the model to move to the GPU; we don't specify how, and we don't need to know. Declarative programming is like writing HTML: you just have to know which tags to use. There are control flows in our tasks, but far fewer compared with rule-based programming. Using deep learning libraries is also similar to using other APIs: we need to know what information a library expects and what kind of information it gives back. Documentation is helpful.
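A tiny Python sketch of the contrast (toy example of our own, not from any library): in conventional programming we write the rule ourselves; in machine learning we only provide examples, and gradient descent finds the rule's weights.

```python
# Conventional programming: we hand-write the rule.
def celsius_to_fahrenheit(c):
    return 1.8 * c + 32.0

# Machine learning: start with arbitrary weights and learn
# the same rule from (input, output) examples.
def learn_linear(pairs, lr=0.01, epochs=5000):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in pairs:
            pred = w * x + b
            err = pred - y
            # gradient descent on squared error: nudge the weights
            w -= lr * err * x
            b -= lr * err
    return w, b

examples = [(c, celsius_to_fahrenheit(c)) for c in [0, 1, 2, 3, 4]]
w, b = learn_linear(examples)
print(w, b)  # learned values approach 1.8 and 32.0
```

Nobody told the second function the formula; it recovered the slope and intercept by repeatedly checking its answers against the examples, which is exactly the "learn by updating weights" loop described above.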
Why is it important to split data?
A quick reminder: data needs to be preprocessed and split. Since we will spend many future tutorials on data preprocessing, we will revisit it later. Data should be split into three randomized / shuffled / stratified datasets: train, validation, and test.
Data for training and selecting machine learning models is split into three parts: 1) training data, 2) validation data used for computing metrics and model selection, and 3) a hold-out dataset that mimics real-world data, used last.
The `train_test_split()` function (from scikit-learn) is used during the data preparation phase of the machine learning workflow, after cleaning.
Some refer to the validation set as the test set. The first dataset, train, is for training the model and updating weights: models learn by tuning weights and then checking answers. Whatever the second dataset is called, it is used for evaluating models, making adjustments, and testing performance. The final dataset mimics real-world data, so it must be representative. It must be a hold-out dataset: the model must never see it until training is nearly complete and the model is ready for its final sanity check. Machine learning models are powerful enough to pick up "leaks" from any data they have ever seen; leakage biases the model and reduces its ability to generalize.
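The three-way split can be sketched in plain Python (scikit-learn's `train_test_split()` does the shuffling for you; here we do it by hand so the logic is visible; the function name and fractions are our own choices):

```python
import random

def split_dataset(data, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle, then split into train / validation / hold-out test sets."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    shuffled = data[:]                 # copy so the original is untouched
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # hold-out: only touched at the very end
    return train, val, test

train, val, test = split_dataset(list(range(100)))
print(len(train), len(val), len(test))  # 70 15 15
```

Note that each example lands in exactly one of the three sets; that disjointness is what prevents the leakage described above.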
Easter eggs for subscribers
Pro members can access premium freebies. Here are some recent ebooks:
https://ml.learn-to-code.co/skillView.html?skill=674gahFoQgi0ihMv9Fsx
https://ml.learn-to-code.co/skillView.html?skill=QIy1G3Se1ZRkrKmcXOch
https://ml.learn-to-code.co/skillView.html?skill=d4rMWbSXLAaCA4kutTOQ
https://ml.learn-to-code.co/skillView.html?skill=mwO6vPVWZW2OtYyK5l7B
High quality cheatsheet for researchers
Additional Resources
What’s Level 5 driving? See our Level 5 driving flash card: http://ml.learn-to-code.co/skillView.html?skill=I3O3hd7VyET9u1zUecfy Toyota acquires Lyft’s self-driving unit. We summarize the key points and the take-home message in this flash card.