Introduction to Hugging Face
In this month’s bonus newsletter, we talk about Hugging Face 🤗 (HF), a fast-growing AI platform; its inaugural AI party at the Exploratorium (an HBO Silicon Valley party moment); and how to use HF to build AI applications fast.
What is Hugging Face? It’s one of the hottest generative AI companies developers should know. It is a multi-purpose model hub for trending open-source deep learning models, and it provides open-source alternatives to popular proprietary models. Is the future of AI open source? Even Google reportedly feels threatened by the rise of sites like HF. Fast Company reports that, in recently leaked internal emails, a senior Google engineer warned of threats from open-source innovation in AI (it is unclear whether this information is reliable). Did you know Google's AI models used to be the foundation of open-source AI? Google researchers are the lead authors of quite a few landmark papers and foundational models, such as BERT and PaLM.
Hugging Face makes it its mission to provide open-source, community-driven AI. It has more than 268K stars in total on GitHub, which makes it the 25th most starred organization of all time on GitHub according to the Gitstar Ranking. Source.
HF is a popular AI development platform and tool. AI developers can use Hugging Face to accelerate development and deployment and to save time, with no need to reinvent the wheel. Its transformer and diffusion models are especially popular. It’s possible to build your own large language model or generative model using HF.
Motivation — Hugging Face Party and AI Trends, VC Funding
🤗 + 🦙+🎉
Need some motivation first? Read about the Hugging Face AI party here 🏖️ 🍾 Nov 15, 2022 :
Incredibly, the party was organized within one week, haphazardly on Twitter, and managed to draw 5,000 attendees. At first, I thought it was a joke. Many, like myself, arrived early and were eager to get into the venue (the San Francisco Exploratorium, an innovator’s museum). It was a full house, something I had never seen before: when organizing events at Stanford, I used to budget for a 30% attrition rate on the event date. The food and drinks were lamer than at most tech events, but the venue was great. Each person got two drink tickets, good for beer, table wine, or water in an aluminum bottle. I wouldn’t be surprised if they were trying to recreate the party scene from HBO's Silicon Valley, without the extravagance. VentureBeat also wrote about the party.
View the Hugging Face AI party video on YouTube (linked below).
Want to read more content like this — developer, tech, startup lifestyle? Our next newsletter is all about tech parties and co-working spaces. Subscribe to our newsletters : Substack, Medium.
It was also the best time to meet the most influential people in AI. Time magazine just released its list of the 100 most influential people in AI. At this party, you could meet plenty of them, e.g. Andrew Ng (co-founder of Coursera), as well as the next generation of AI leaders.
Hugging Face also gave attendees the opportunity to apply and be featured in the expo — at the open hall of the event. I thought Deepchecks was useful. The developer engineer at the event was particularly helpful in demonstrating the utilities: “Everything You Need for Continuous Validation of LLMs & AI Testing. CI/CD. Monitoring A comprehensive validation solution for every phase in the lifecycle — because AI systems are more fragile than you think. All based on our open-source core”.
They also brought llamas 🦙🦙🦙 to attend the event! Meet the Llamas and Alpaca … models: Hugging Face hired real-life llamas to attend its AI party https://ml.learn-to-code.co/skillView.html?skill=juzQSGgbErp6TFhinsMw
Want to learn more about LlamaIndex? Check out our Uniqtech Guide to LlamaIndex and the Llama and Alpaca models. By the way, Meta (Facebook) released Llama 2.
Are you excited about AI? Generative AI is definitely the buzzword in 2024. Learn AI with our helpful flash cards, which can save you time and keep you up to date with the latest in AI. Did you know that OpenAI reportedly makes about $1 billion annually in revenue (Fast Company)?!
Warning: this is just one estimate and may not be reliable. Do your own research.
Hugging Face in the news: "Hugging Face raises $235M from investors, including Salesforce and Nvidia" (Source: TechCrunch, August 24, 2023)
NVIDIA collaborates with Hugging Face and provides AI computing. "NVIDIA and Hugging Face ... announced a partnership that will put generative AI supercomputing at the fingertips of millions of developers building large language models (LLMs) and other advanced AI applications." - Source: NVIDIA Blog "As part of the collaboration, Hugging Face will offer a new service — called Training Cluster as a Service — to simplify the creation of new and custom generative AI models for the enterprise. Powered by NVIDIA DGX Cloud, the service will be available in the coming months" NVIDIA Blog August 8, 2023"
Full guide to Hugging Face
Here's a summary (see the full guide, linked below): HF functionality is abstracted into tasks. To use a model in HF, first create a pipeline object with the task name as the parameter. For example, create a classifier for the image classification task using this code: `from transformers import pipeline; clf = pipeline("image-classification")`. In this case the task name is "image-classification". Another task name is "sentiment-analysis".
Initializing (loading) a typical pipeline, which wraps a built-in model: give it input text and print the result using this code snippet:
from transformers import pipeline
# Pass a task name such as "sentiment-analysis"; a default model for that task is loaded
classifier = pipeline("sentiment-analysis")
result = classifier("Your input text here.")
print(result)
Read more about tasks at this link https://huggingface.co/tasks (also linked to in the flash card below).
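To make the pipeline pattern above a little more concrete, here is a minimal sketch that runs several texts through the sentiment-analysis task and pairs each input with its prediction. The helper names (`load_sentiment_pipeline`, `pair_predictions`, `classify`) are ours, for illustration only. The sketch assumes the `transformers` package and a backend such as PyTorch are installed, and that the default model for the task can be downloaded on first use.

```python
def load_sentiment_pipeline():
    """Build a pipeline for the "sentiment-analysis" task.

    Assumption: transformers is installed; the default model for the
    task is downloaded from the Hub the first time this runs.
    """
    from transformers import pipeline  # imported lazily: heavy dependency
    return pipeline("sentiment-analysis")


def pair_predictions(texts, predictions):
    """Pair each input text with its predicted label and rounded score.

    `predictions` is expected to look like the pipeline output for a list
    of inputs: one dict with "label" and "score" keys per text.
    """
    if len(texts) != len(predictions):
        raise ValueError("texts and predictions must have the same length")
    return [
        (text, pred["label"], round(pred["score"], 3))
        for text, pred in zip(texts, predictions)
    ]


def classify(texts, classifier=None):
    """Classify a list of texts; `classifier` can be any pipeline-like callable."""
    if classifier is None:
        classifier = load_sentiment_pipeline()
    return pair_predictions(texts, classifier(texts))
```

Calling `classify(["I love this newsletter.", "The drinks were lame."])` should return one `(text, label, score)` tuple per input. Passing your own `classifier` makes it easy to swap in a different task or model.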
A typical workflow to initialize a pipeline with a built-in model and supply it with labels (candidates) looks like this:
# classification example
from transformers import pipeline
classifier = pipeline("zero-shot-classification")
res = classifier("This tutorial is about Hugging Face", candidate_labels=["news", "education", "technology"])
print(res)
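The zero-shot result is a dictionary with parallel "labels" and "scores" lists, so a little post-processing is usually needed. The sketch below shows one way to read it. The helper names are hypothetical, and `classify_zero_shot` assumes `transformers` is installed and the default model for the task can be downloaded.

```python
def top_label(result):
    """Return the (label, score) pair with the highest score.

    `result` is expected to be the dict returned by a zero-shot
    classification pipeline, with parallel "labels" and "scores" lists.
    """
    return max(zip(result["labels"], result["scores"]), key=lambda pair: pair[1])


def labels_above(result, threshold=0.5):
    """Return every label whose score is at least `threshold`."""
    return [
        label
        for label, score in zip(result["labels"], result["scores"])
        if score >= threshold
    ]


def classify_zero_shot(text, candidate_labels, multi_label=False):
    """Run zero-shot classification on `text`.

    Assumption: transformers is installed and the default model for the
    task can be downloaded. With multi_label=True each label is scored
    independently, so the scores need not sum to 1.
    """
    from transformers import pipeline  # imported lazily: heavy dependency
    classifier = pipeline("zero-shot-classification")
    return classifier(text, candidate_labels=candidate_labels, multi_label=multi_label)
```

For example, `top_label(classify_zero_shot("This tutorial is about Hugging Face", ["news", "education", "technology"]))` picks the single best label, while `labels_above` is handy when several labels may apply.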
Hugging Face also provides other useful classes. This import statement pulls in several tokenizer and model classes: `from transformers import AutoTokenizer, AutoModelForSequenceClassification, BertTokenizer, BertModel`
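As a rough sketch of what these classes do when used without a pipeline: the tokenizer turns text into tensors, the model returns raw scores (logits), and a softmax turns those into probabilities. The checkpoint name below is our choice for illustration, not something from the guide, and the code assumes both `transformers` and `torch` are installed.

```python
import math


def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(text, checkpoint="distilbert-base-uncased-finetuned-sst-2-english"):
    """Tokenize `text`, run the model, and return (label, probability).

    Assumptions: transformers and torch are installed, and `checkpoint`
    (an illustrative choice) is available on the Model Hub.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

    inputs = tokenizer(text, return_tensors="pt")  # "pt" asks for PyTorch tensors
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**inputs).logits[0].tolist()

    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return model.config.id2label[best], probs[best]
```

The `Auto*` classes pick the right architecture from the checkpoint's configuration, which is why the same two lines work for BERT, DistilBERT, and many other models.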
Part 1 Uniqtech Guide to Deep Dive on Hugging Face Part 01. Getting Started https://ml.learn-to-code.co/skillView.html?skill=7nxypHNkHZaN88UktRpm
Part 1.5 Uniqtech Guide to Deep Dive on Hugging Face Part 1.5. Introduction to Hugging Face API https://ml.learn-to-code.co/skillView.html?skill=FFg3LaDVw73lSA9d20sQ
What is Hugging Face? Getting started with Hugging Face state of art NLP models, transformers [pro] https://ml.learn-to-code.co/skillView.html?skill=0KAxS5PW1JQEtH8xCZ19
Models on Hugging Face
As of April 19, 2023, there were 181,245 models available on the model hub, contributed by the community. Pro tip: you can see this stat by visiting the models tab of the Hugging Face model hub (https://huggingface.co/models). More on the Hugging Face Model Hub: there are many community-built models in the hub, and you can search and filter them by characteristics. It is easily the largest model hub out there. Warning: please do your own research. For example, HF offers open models that may allow academic or educational use, commercial use, or both.
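The same search can also be done from code. The sketch below assumes the separate `huggingface_hub` package is installed and the Hub is reachable; `search_models` and `model_ids` are hypothetical helper names, and the exact attributes on the returned entries can vary between library versions.

```python
def model_ids(models):
    """Extract the repository id (for example "bigscience/bloom-3b") from each entry."""
    return [m.id for m in models]


def search_models(query, limit=5):
    """Return up to `limit` model ids from the Hub that match `query`.

    Assumption: huggingface_hub is installed and the Hub is reachable;
    no access token is needed for public models.
    """
    from huggingface_hub import HfApi  # imported lazily: only useful with network access
    return model_ids(HfApi().list_models(search=query, limit=limit))
```

For instance, `search_models("bloom")` should return a short list of repository ids that you can pass to `pipeline(..., model=...)` or `from_pretrained(...)`.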
Hugging Face provides famous, well-known open-source models and readily available algorithms, abstracted into tasks. Developers can use pre-trained models as they are, or train and fine-tune them on Hugging Face and supply them with new prompts or data. For example, the Transformers library provides state-of-the-art (SoA) pre-trained models and APIs for training them.
Hugging Face provides high-level access to the models. It hosts implementations of some open-source models (some with limitations) as well as some state-of-the-art (SoA) models. Developers can leverage existing models, configurations, and containers already hosted on the Model Hub. Hugging Face models are also available on AWS.
Hugging Face Tasks - Loading and using Hugging Face models https://ml.learn-to-code.co/skillView.html?skill=be4gkxNQIDZd3Xw8Tbcs
Transformer and stable diffusion models are the most popular models on the platform as of 2023. Transformer models include GPT-2 and the previous state-of-the-art (SoA) model, BERT. Another popular, powerful example model on Hugging Face is BLOOM 3b, the BigScience Large Open-science Open-access Multilingual Language Model (model card: https://huggingface.co/bigscience/bloom-3b). LlamaIndex is also available on Hugging Face. Source: https://huggingface.co/llamaindex
Transformer models https://huggingface.co/docs/transformers/task_summary
Stable diffusion models https://huggingface.co/docs/diffusers/v0.14.0/en/stable_diffusion
Another example model available in the hub is Bloomz. What is Bloomz (definition)? Bloomz is the more useful (practical) cousin of the Bloom LLM, which has billions of parameters. It is a version of Bloom fine-tuned on an instruction dataset.
To facilitate model training, HF also provides common datasets on its platform. View the full Hugging Face dataset collection here: https://huggingface.co/datasets Example: the Pile of Law dataset https://huggingface.co/datasets/pile-of-law/pile-of-law
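Datasets on the platform can be pulled into Python with the separate `datasets` package. The following is a minimal sketch, assuming `datasets` is installed; the dataset name `rotten_tomatoes` is our illustrative pick (a small public dataset with "text" and "label" columns), not one named in this newsletter, and the helper names are hypothetical.

```python
from collections import Counter


def load_sample(name="rotten_tomatoes", split="train", n=200, seed=0):
    """Load `n` shuffled rows from a Hub dataset.

    Assumptions: the datasets package is installed, and the dataset
    (an illustrative choice) has "text" and "label" columns.
    """
    from datasets import load_dataset  # imported lazily: heavy dependency
    dataset = load_dataset(name, split=split)
    return dataset.shuffle(seed=seed).select(range(n))


def label_counts(rows):
    """Count how many rows carry each label value."""
    return Counter(row["label"] for row in rows)
```

Running `label_counts(load_sample())` gives a quick check of class balance before any training. Some datasets, possibly including larger ones like the Pile of Law, need extra arguments such as a configuration name, so check the dataset card first.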
Hugging Face co-founder Clem Delangue, featured in Time magazine's Time 100 AI list, shows us how to track trending models, datasets, and spaces on Hugging Face in this tweet.
https://twitter.com/clementdelangue/status/1701381804274684150
Popular models: falcon, llama2, stable diffusion, transformer, FLM (not in order). Popular datasets: ChatGPT prompts, ShareGPT-Chinese-English-90K, goodwiki (not in order). "Trending models (http://hf.co/models), datasets (http://hf.co/datasets) and apps (http://hf.co/spaces) of the week!"
Intermediate Hugging Face Features and Best Practice
In this deep dive below 👇 we talk about how to save time building AI apps on HF. Using HF functionalities, developers can build AI apps with turbo speed.
Using Hugging Face as infrastructure for AI app development
Part 2 Uniqtech Guide to Deep Dive on Hugging Face 02 [pro] [paid members] https://ml.learn-to-code.co/skillView.html?skill=ZOWtgGa3vMoliRjf5lEG
Part 2.5 Uniqtech Guide to Deep Dive on Hugging Face 2.5 [pro] [paid members] https://ml.learn-to-code.co/skillView.html?skill=3l5CjFGxvwPwiRWlnHwA
Part 3 Easter egg ebook: developers’ guide to Hugging Face. Uniqtech Guide to Deep Dive on Hugging Face 03 - How to use Hugging Face to build apps [ebook] [pro] [easter egg] https://ml.learn-to-code.co/skillView.html?skill=DCTwUZJnWgUE6Zo9r2JZ
Check this out: Hugging Face's alternatives to GPT-3. Open Source Alternatives to GPT-3, ChatGPT [public, easter egg, pro tip] https://ml.learn-to-code.co/skillView.html?skill=KSdGxvVRvUkXs3FRicyK
Hugging Face is also available on AWS SageMaker.
Transfer learning with Hugging Face. The basic transfer learning workflow is simple: choose the right model, download it, and fine-tune it as needed. Most of the models on HF are plug-and-play ready. Transfer Learning Workflow with Hugging Face https://ml.learn-to-code.co/skillView.html?skill=9DyvRYGieJh2sejVWyIj
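One common way to carry out this workflow is to freeze the pre-trained body of the model and train only a new classification head. The sketch below illustrates that idea under stated assumptions: `transformers` and `torch` are installed, and the checkpoint name and `num_labels` value are illustrative choices of ours. It is one possible approach, not the only way to fine-tune.

```python
def freeze(parameters):
    """Turn off gradient updates for every parameter; return how many were frozen."""
    frozen = 0
    for param in parameters:
        param.requires_grad = False
        frozen += 1
    return frozen


def trainable_parameters(model):
    """Return the parameters an optimizer should update during fine-tuning."""
    return [p for p in model.parameters() if p.requires_grad]


def build_frozen_classifier(checkpoint="distilbert-base-uncased", num_labels=2):
    """Download a pre-trained encoder and leave only the new head trainable.

    Assumptions: transformers and torch are installed; `checkpoint` and
    `num_labels` are illustrative values.
    """
    from transformers import AutoModelForSequenceClassification

    # Step 1: choose a model and download it; a fresh classification head is added.
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )
    # Step 2: freeze the pre-trained body so fine-tuning only updates the head.
    freeze(model.base_model.parameters())
    return model
```

After this, something like `torch.optim.AdamW(trainable_parameters(model), lr=1e-3)` would update only the head. Unfreezing the whole model usually gives better accuracy at a higher compute cost, so which layers to freeze is a design choice.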
Advanced Hugging Face Features and Best Practice
Pro flash cards are for paid members only.
The most important role Hugging Face serves is to simplify MLOps (DevOps for Machine Learning), saving developers time setting up infrastructure, model training and hosting.
Advanced Hugging Face features, offerings, use cases [pro] https://ml.learn-to-code.co/skillView.html?skill=HgL7SFn6BgyXXBNEq6o6
Full Course for Hugging Face [Easter Egg] : NLP state-of-art transformers tutorial
https://ml.learn-to-code.co/skillView.html?skill=fgMl8Ag2RWJZEm3S9YBB
Hugging Face Endpoints
https://ml.learn-to-code.co/skillView.html?skill=wgguscUvt28jDDPEe8Om
Appendix
Introduction - Hugging Face basics [pro] https://ml.learn-to-code.co/skillView.html?skill=l7PgV1pnZBBCO611NC8f
Models - DistilBERT on Hugging Face. What's DistilBERT? [definition] [pro] https://ml.learn-to-code.co/skillView.html?skill=8AKvHXZEMGQm8AwXxLfr
Models - Stable Diffusion on Hugging Face. Play with stable diffusion using the playground here (linked to below). https://huggingface.co/spaces/stabilityai/stable-diffusion