Large Language Models
Resources and Links
- Dan’s Lambda School DS Youtube Playlist
- (probably dated but maybe good background on certain topics?)
- Daniel Whitenack’s (DW) go-genai-webinar repo
- seminar presentation: link
- Youtube seminar video (sponsored by Arden Labs): Youtube link
- Prediction Guard Documentation - some open models available
- Cohere: https://cohere.com/ | Cohere brings you cutting-edge multilingual models, advanced retrieval, and an AI workspace tailored for the modern enterprise — all within a single, secure platform
- DW suggested links:
- For building personal / low expense outlay look at HuggingFace (ZeroGPU program?)
- DeepNote: AI driven notebooks - https://deepnote.com/
- Data Science from Scratch code: https://github.com/joelgrus/data-science-from-scratch
Notes on Large Language Models
- (Very) High Level Notes on LLM Execution
- LLMs take a prompt and then calculate probabilities of words (tokens?) that should follow each other
- For the prompt `Go is...` the LLM may generate these words in descending order of probability: `a`, `programming`, `language`, …
- NOTE: this is very similar to autocomplete
- LLMs will calculate probabilities for every word (token?) they know about (DA: the scale seems massive)
- LLM `Temperature` configuration setting: sounds like it introduces some variability into the output so that the results are not always driven by strict probabilities (which would be boring or tend to lack creativity); see the sketch below
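A minimal, self-contained Go sketch (not from DW's webinar) of how temperature reshapes next-token probabilities; the candidate words and logit scores are made up purely for illustration.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// softmaxWithTemperature turns raw model scores (logits) into probabilities.
// Lower temperature sharpens the distribution (near-greedy picks);
// higher temperature flattens it (more varied, "creative" output).
func softmaxWithTemperature(logits []float64, temperature float64) []float64 {
	probs := make([]float64, len(logits))
	var sum float64
	for i, l := range logits {
		probs[i] = math.Exp(l / temperature)
		sum += probs[i]
	}
	for i := range probs {
		probs[i] /= sum
	}
	return probs
}

// sample draws one index according to the probability distribution.
func sample(probs []float64) int {
	r := rand.Float64()
	var cum float64
	for i, p := range probs {
		cum += p
		if r < cum {
			return i
		}
	}
	return len(probs) - 1
}

func main() {
	// Hypothetical candidates for the prompt "Go is..." with made-up logits.
	tokens := []string{"a", "programming", "language", "fast"}
	logits := []float64{3.0, 2.0, 1.0, 0.5}

	for _, temp := range []float64{0.2, 1.0, 2.0} {
		probs := softmaxWithTemperature(logits, temp)
		fmt.Printf("temperature=%.1f probs=%v next=%q\n",
			temp, probs, tokens[sample(probs)])
	}
}
```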
- Enterprises are gradually migrating from Closed LLMs (e.g. OpenAI [ChatGPT], Anthropic [Claude]) to Open LLMs (e.g. Llama 3, DeepSeek, Mistral)
- Daniel Whitenack’s spectrum of AI complexity.
- Basic Prompting
- Prompt Engineering (CoT, templates, parameters)
- Augmentation, Retrieval
- Agents, Chaining
- Fine-tuning via a closed API
- Fine-tuning an open model
- Training a model from scratch
- Prompts
- LLMs are tuned to use prompts that are formatted in specific ways (rather than a simple text question)
- Accuracy
- Autocomplete LLMs focus on coherency, so their answers may be coherent but not necessarily accurate. PredictionGuard uses factual consistency checking models to confirm accuracy.
- For example:
The White House is painted pink. (the sentence is coherent but not accurate)
- Training Models
- DW: “You should never, ever, ever have to train a model for the rest of your life”
- You should…
- Use an open model and inject your data into the prompt (see the sketch below)
- At most, you may need to fine-tune a model
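A hedged sketch of the "inject your data into the prompt" idea: the template, context text, and question below are placeholder examples, not PredictionGuard's or any model's required format.

```go
package main

import "fmt"

// buildAugmentedPrompt splices your own data into the prompt so an open
// model can answer from it, instead of training or fine-tuning a model.
func buildAugmentedPrompt(context, question string) string {
	return fmt.Sprintf(
		"Answer the question using only the context below.\n\nContext:\n%s\n\nQuestion: %s\nAnswer:",
		context, question)
}

func main() {
	// Placeholder data standing in for whatever you retrieved from your own store.
	context := "Acme's support hours are 9am-5pm ET, Monday through Friday."
	question := "When can I reach Acme support?"

	fmt.Println(buildAugmentedPrompt(context, question))
}
```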
LLM Model API Behavior
- Daniel Whitenack: most APIs, including PredictionGuard (Daniel’s company), will start streaming completion text immediately and essentially spit it out serially
- NOTE: In the PredictionGuard Go code, they use a channel to receive that stream (and then your Go program can start printing it?); see the sketch below
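A minimal hypothetical sketch of that channel pattern in Go; `streamCompletion` below is a stand-in that fakes the chunks, not the actual PredictionGuard client.

```go
package main

import (
	"fmt"
	"time"
)

// streamCompletion stands in for an SDK call that streams completion text.
// It sends chunks on a channel as they "arrive" and closes it when done.
func streamCompletion(prompt string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		// Fake chunks; a real client would push chunks from the HTTP stream here.
		for _, chunk := range []string{"Go ", "is ", "a ", "programming ", "language."} {
			time.Sleep(50 * time.Millisecond) // simulate network latency
			out <- chunk
		}
	}()
	return out
}

func main() {
	// The caller can start printing as soon as the first chunk arrives.
	for chunk := range streamCompletion("Go is...") {
		fmt.Print(chunk)
	}
	fmt.Println()
}
```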
Embeddings / Vector Representations
- See https://cohere.com/ for a service to generate / capture embeddings
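To make "vector representation" concrete, here is a small Go sketch comparing made-up embedding vectors with cosine similarity; real embeddings from a service like Cohere are just much longer slices of floats.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity measures how closely two embedding vectors point in the
// same direction: ~1 means very similar meaning, ~0 means unrelated.
func cosineSimilarity(a, b []float64) float64 {
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	// Tiny made-up 4-dimensional "embeddings"; real ones have hundreds of dimensions.
	cat := []float64{0.9, 0.1, 0.0, 0.2}
	kitten := []float64{0.8, 0.2, 0.1, 0.2}
	invoice := []float64{0.0, 0.9, 0.8, 0.1}

	fmt.Printf("cat vs kitten:  %.3f\n", cosineSimilarity(cat, kitten))
	fmt.Printf("cat vs invoice: %.3f\n", cosineSimilarity(cat, invoice))
}
```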
Prompts
- Prompt Formatting: various models are trained to handle prompts with specific text formatting. Structuring prompts in this way should optimize execution (?)
ChatML Format
- the actual text goes in place of the curly-brace placeholders
```
<|im_start|>system
{prompt}<|im_end|>
<|im_start|>user
{context or user message}<|im_end|>
<|im_start|>assistant<|im_end|>
```
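A small Go sketch of filling those placeholders; the system prompt and user message are just example values.

```go
package main

import "fmt"

// chatML mirrors the template above: the first %s is the {prompt} (system)
// slot and the second is the {context or user message} slot.
const chatML = `<|im_start|>system
%s<|im_end|>
<|im_start|>user
%s<|im_end|>
<|im_start|>assistant<|im_end|>`

func main() {
	// Example values standing in for the real system prompt and user message.
	system := "You are a helpful assistant that answers questions about Go."
	user := "What is a goroutine?"

	fmt.Printf(chatML, system, user)
	fmt.Println()
}
```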
Large Language vs. Foundation Models
From a ChatGPT summary of the differences between Large Language Models and Foundation Models
| Feature | Large Language Models (LLMs) | Foundation Models (FMs) |
|---|---|---|
| Scope | Focused on text-based tasks | Can handle text, images, video, and more |
| Training Data | Large text datasets | Multimodal datasets (text, images, video, audio) |
| Use Cases | Chatbots, code generation, text summarization | Image generation, speech recognition, robotics, multimodal AI |
| Examples | GPT-4, BERT, LLaMA | GPT-4V, CLIP, Gemini, DALL·E |
Summary
- LLMs are a subset of Foundation Models that specialize in language processing.
- Foundation Models are more general-purpose and can handle multiple types of data (text, images, audio, etc.).
- Many LLMs (like GPT-4) can also be considered Foundation Models because they serve as a base for fine-tuning.