Streamlit Pocket Book — Uplatz

50 in-depth cards • Wide layout • Readable examples • 20-question interview Q&A included

Section 1 — Foundations

1) What is Streamlit?

Streamlit is a Python framework that turns scripts into shareable web apps with minimal code—ideal for data apps, dashboards, and ML demos. It handles UI, state, reruns, and layout automatically.

pip install streamlit
streamlit hello
# run your app:
streamlit run app.py

2) Minimal App Anatomy

Streamlit executes top-to-bottom on every interaction. Widgets set state, then the script reruns to reflect changes. Use st.title, st.write, widgets, and charts.

import streamlit as st
st.title("Hello Streamlit")
name = st.text_input("Your name")
st.write("Hi", name or "there!")

3) Layout Basics

Use columns, tabs, and expanders to structure content; sidebar for navigation and global controls.

col1, col2 = st.columns(2)
with col1: st.metric("Users", 1200)
with col2: st.metric("Churn", "4.8%", "-0.3pp")

4) State & Reruns

Widgets store values; for custom state, use st.session_state. Changes trigger reruns; guard expensive work with caching.

if "count" not in st.session_state: st.session_state.count = 0
if st.button("Add"): st.session_state.count += 1
st.write("Count:", st.session_state.count)
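The rerun model can be easier to see outside Streamlit. The sketch below is a plain-Python simulation, not Streamlit's implementation: `run_script` and the `session_state` dict are hypothetical stand-ins for the script body and `st.session_state`. Ordinary local variables are rebuilt on every call, while the dict survives between calls.

```python
# Plain-Python simulation of the rerun model (not Streamlit internals).
# `session_state` stands in for st.session_state; `run_script` for app.py.

session_state = {}  # survives across reruns, like st.session_state


def run_script(button_clicked: bool) -> dict:
    """One top-to-bottom execution of the 'script'."""
    local_count = 0  # ordinary variable: reset on every rerun

    # Initialize persistent state once, as in the card above.
    if "count" not in session_state:
        session_state["count"] = 0

    # A click is only True during the rerun it triggered.
    if button_clicked:
        session_state["count"] += 1
        local_count += 1

    return {"local": local_count, "persistent": session_state["count"]}


# Three interactions: click, click, then some other widget change.
results = [run_script(True), run_script(True), run_script(False)]
```

After the three runs the persistent counter is 2, while the local counter never exceeds 1, which is why counters and similar values belong in session state.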

5) Forms for Controlled Submit

Forms bundle inputs and only trigger a rerun when submitted—useful for multi-field validation.

with st.form("login"):
  user = st.text_input("User"); pwd = st.text_input("Pass", type="password")
  ok = st.form_submit_button("Sign in")
if ok: st.success(f"Welcome {user}")
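Multi-field validation is easier to test when it lives in a plain function that the form calls after submit. This is a minimal sketch: `validate_login` and its rules (non-empty user, 8-character minimum) are assumptions for illustration, not something the card prescribes.

```python
def validate_login(user: str, pwd: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if not user.strip():
        errors.append("User is required.")
    if len(pwd) < 8:  # assumed minimum length
        errors.append("Password must be at least 8 characters.")
    return errors


# Possible wiring inside the app (assumes the form from the card above):
# if ok:
#     problems = validate_login(user, pwd)
#     if problems:
#         for p in problems:
#             st.error(p)
#     else:
#         st.success(f"Welcome {user}")
```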

6) Caching (Results & Resources)

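@st.cache_data caches the return values of pure functions, keyed by their arguments; @st.cache_resource caches heavy shared objects (DB engines, models). Invalidate with ttl, or call .clear() on the cached function (st.cache_data.clear() empties every data cache).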

import pandas as pd
@st.cache_data(ttl=600)  # entries expire after 10 minutes
def load_df(url): return pd.read_csv(url)
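To make "cached by args, expired by ttl" concrete, here is a conceptual sketch in plain Python. It is not how Streamlit implements `st.cache_data`; `ttl_cache`, `load`, and the injectable `clock` are hypothetical names used only to show the idea deterministically.

```python
import time
from functools import wraps


def ttl_cache(ttl: float, clock=time.monotonic):
    """Cache results per positional-argument tuple for `ttl` seconds."""
    def decorator(func):
        store = {}  # args -> (expiry_time, value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # fresh entry: skip the real call
            value = func(*args)
            store[args] = (now + ttl, value)
            return value

        wrapper.clear = store.clear  # manual invalidation
        return wrapper
    return decorator


calls = []
fake_now = [0.0]  # controllable clock so the example is deterministic


@ttl_cache(ttl=600, clock=lambda: fake_now[0])
def load(url):
    calls.append(url)
    return f"data from {url}"


load("a.csv")
load("a.csv")       # served from the cache, the function body does not run
fake_now[0] = 601.0
load("a.csv")       # entry expired, so the function runs again
```

Here `calls` ends up with two entries for three invocations. The same reasoning explains why a cached loader should take everything that affects its result as an argument.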

7) Display Primitives

Use st.write, st.markdown, st.code, st.json, st.table, and st.dataframe for rich display. st.markdown escapes raw HTML by default; pass unsafe_allow_html=True only for markup you trust.

st.markdown("**Bold** and _italic_"); st.code("print('Hi')", language="python")

8) Charts

Native charts: st.line_chart, st.area_chart, st.bar_chart. Integrates with Altair, Plotly, Matplotlib, PyDeck.

st.line_chart({"x":[1,2,3],"y":[3,1,4]})

9) File Upload/Download

Use st.file_uploader to accept files; create downloadable assets with st.download_button.

up = st.file_uploader("CSV", type="csv")
if up: st.download_button("Echo", up.getvalue(), "copy.csv")

10) Q&A — “Why Streamlit over Flask + React?”

Answer: Streamlit trades custom UI flexibility for speed: data folks ship functional apps fast using only Python, with widgets, state, caching, and hosting handled.

Section 2 — Widgets & Interactions

11) Core Widgets

Inputs: text, number, slider, selectbox, multiselect, date/time, checkbox, radio, color picker.

age = st.slider("Age", 0, 100, 30)
color = st.color_picker("Pick", "#00f")

12) Callbacks

Use on_change to run code when a widget value changes. Useful for validation and dependent widgets.

def normalize(): st.session_state.q = st.session_state.q.strip()
st.text_input("Query", key="q", on_change=normalize)

13) Session State Patterns

Initialize keys, group related state in dicts, and persist user choices between pages. Use st.session_state.clear() to reset.

st.session_state.setdefault("filters", {"country":"All"})

14) Forms vs Live Inputs

Live inputs rerun on every change; forms wait for submit. Use forms for heavy queries and multi-field validation.

with st.form("search"): q = st.text_input("Q"); go = st.form_submit_button("Go")

15) Upload Large Files

Raise the upload limit with server.maxUploadSize (in MB; the default is 200) in .streamlit/config.toml. Process big files in chunks and show a progress bar for UX.

prog = st.progress(0)
for i in range(100): prog.progress(i+1)
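Chunked processing with progress reporting might look like the sketch below. It hashes a file-like object one chunk at a time; `sha256_in_chunks`, the 1 MiB chunk size, and the `on_progress` callback are assumptions for illustration, and an in-memory `io.BytesIO` stands in for the uploaded file.

```python
import hashlib
import io


def sha256_in_chunks(fileobj, total_size, on_progress, chunk_size=1024 * 1024):
    """Hash a binary file-like object chunk by chunk, reporting 0-100 progress."""
    digest = hashlib.sha256()
    done = 0
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        done += len(chunk)
        on_progress(min(100, done * 100 // max(total_size, 1)))
    return digest.hexdigest()


# Stand-in for an uploaded file; in an app this would come from st.file_uploader.
payload = b"x" * 2_500_000
seen = []
checksum = sha256_in_chunks(io.BytesIO(payload), len(payload), seen.append)
```

In an app, `on_progress` could be the `progress` method of the bar created by `st.progress(0)`, and `total_size` the uploaded file's `size` attribute.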

16) Toasts, Status & Spinners

Use st.toast, st.status, and st.spinner to communicate long tasks clearly.

import time
with st.spinner("Crunching..."): time.sleep(2)

17) Keyboard & Hotkeys (Tip)

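Streamlit has no built-in hotkey API. You can inject JS with st.components.v1.html, but the script runs inside an iframe and only sees key events while that iframe has focus. Otherwise keep it simple: Streamlit favors click interactions.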

st.components.v1.html("<script>document.onkeydown=e=>...</script>", height=0)

18) Multi-Page Apps

Create a pages/ folder next to app.py; each .py file in it becomes a page listed in the app's sidebar navigation. Share state via st.session_state and cached resources.

# app.py + pages/01_Explore.py + pages/02_Train.py

19) The Sidebar

Place global filters and navigation in st.sidebar; keep pages clean and focused.

with st.sidebar: st.radio("Theme", ["Light","Dark"])

20) Q&A — “When should I use forms?”

Answer: When inputs are coupled (query + filters) and you want one deliberate submit; also when querying heavy backends to avoid reruns on each keystroke.

Section 3 — Data, Charts, Maps & Media

21) DataFrames

st.dataframe is interactive (sort, search, resize columns); st.table is static. Use column_config to format and add links/images.

st.dataframe(df, use_container_width=True)

22) Altair & Plotly

Embed rich charts from Altair/Plotly. Bind widgets to filter data and redraw charts on change.

import altair as alt
st.altair_chart(alt.Chart(df).mark_bar().encode(x="cat", y="val"), use_container_width=True)

23) Matplotlib & Seaborn

Render Matplotlib figures via st.pyplot. Keep figures small; prefer Altair/Plotly for interactivity.

import matplotlib.pyplot as plt
fig, ax = plt.subplots(); ax.plot([1,2,3],[1,4,9]); st.pyplot(fig)

24) Maps with PyDeck

Use PyDeck for WebGL maps and layers (Scatterplot, Hexagon). Great for geo visualizations.

import pydeck as pdk
st.pydeck_chart(pdk.Deck(map_style=None, initial_view_state=pdk.ViewState(latitude=37.76, longitude=-122.4, zoom=10)))

25) Media: Images, Audio, Video

Use st.image, st.audio, st.video to preview content. Combine with uploaders to build media tools quickly.

st.image("plot.png"); st.audio(data, format="audio/wav")

26) Ag-Grid / Data Editors

st.data_editor enables editable tables. For advanced grids, use a community component such as streamlit-aggrid (an Ag-Grid wrapper built on Streamlit's components API).

edited = st.data_editor(df)

27) Large Data Tips

Lazy load, paginate, sample, or aggregate. Cache results; stream results to the UI incrementally with st.empty.

slot = st.empty()
for chunk in read_big(): slot.dataframe(chunk)
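Pagination mostly comes down to computing a safe row window. A minimal sketch, assuming 1-based page numbers; `page_bounds` is a hypothetical helper, not a Streamlit API.

```python
import math


def page_bounds(n_rows: int, page: int, page_size: int) -> tuple[int, int]:
    """Return (start, stop) row offsets for a 1-based page, clamped to range."""
    n_pages = max(1, math.ceil(n_rows / page_size))
    page = min(max(page, 1), n_pages)  # clamp out-of-range requests
    start = (page - 1) * page_size
    return start, min(start + page_size, n_rows)


rows = list(range(95))
start, stop = page_bounds(len(rows), page=10, page_size=10)
last_page = rows[start:stop]  # the final, shorter page
```

With a DataFrame, the same bounds would feed `df.iloc[start:stop]`, with the page number coming from a widget such as `st.number_input`.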

28) File System Access

Use st.file_uploader or mount volumes when deploying. Avoid blocking I/O; offload to threads for responsiveness.

29) Theming Charts

Match dark/light mode with chart themes (Plotly templates, Altair themes). Respect Streamlit theme settings.

st.plotly_chart(fig, use_container_width=True)

30) Q&A — “When use PyDeck vs Altair?”

Answer: PyDeck for heavy geospatial layers and 3D; Altair for statistical charts. Use both—Altair for summaries + PyDeck for spatial drill-down.

Section 4 — ML Apps, LLMs, Backends & Deployment

31) Model Inference App

Load a pickled model once and cache the object with @st.cache_resource so it is not reloaded on every rerun; then predict on user input.

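import joblib
@st.cache_resource  # loaded once per server process, reused across reruns
def get_model(): return joblib.load("model.pkl")
# sepal/petal were undefined in the original; two numeric inputs are assumed here
sepal = st.number_input("Sepal length"); petal = st.number_input("Petal length")
model = get_model(); st.write(model.predict([[sepal, petal]]))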

32) Batch Jobs & Progress

Long tasks: show progress, stream logs to UI, and let users download results. Consider background workers for heavy tasks.

import time
log = st.empty()
for i in range(100): log.text(f"Step {i}"); time.sleep(.05)

33) LLM Chat UI Scaffold

Build a chat with message history and streaming tokens. Wrap API calls with timeouts/retries; cache embeddings if used.

if "chat" not in st.session_state: st.session_state.chat=[]
msg = st.chat_input("Ask…")
if msg: st.session_state.chat.append(("user", msg))
for role, text in st.session_state.chat: st.chat_message(role).write(text)
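The "retries" part of the advice can be sketched as a small wrapper with exponential backoff. Everything here is illustrative: `call_with_retries` and `flaky_completion` are hypothetical names, and the fake function stands in for a real API call. The timeout itself is normally a parameter of whichever HTTP or SDK client you use and is not shown.

```python
import time


def call_with_retries(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on an exception wait base_delay * 2**n seconds and retry."""
    for n in range(attempts):
        try:
            return fn()
        except Exception:
            if n == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** n)


# Fake client that fails twice, then answers.
state = {"calls": 0}


def flaky_completion():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TimeoutError("simulated timeout")
    return "hello"


delays = []  # record the waits instead of actually sleeping
reply = call_with_retries(flaky_completion, sleep=delays.append)
```

In the chat loop, the reply would then be appended to `st.session_state.chat` as an assistant message.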

34) RAG Pattern

Index docs → embed → search top-k → feed to LLM. Cache vector index and document chunks as resources.

# pseudo: vectors = embed(chunks); hits = search(q); prompt = fmt(hits, q)
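The pseudo-code above can be fleshed out into a runnable toy. This sketch keeps the names `embed`, `search`, and `fmt` from the card but fills them in with assumptions: word counts stand in for a real embedding model, cosine similarity for a vector store, and the prompt template is invented for illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: list[str], vectors: list[Counter], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(zip(chunks, vectors), key=lambda cv: cosine(q, cv[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def fmt(hits: list[str], query: str) -> str:
    """Build the prompt that would be sent to the LLM."""
    context = "\n".join(f"- {h}" for h in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


chunks = [
    "streamlit reruns the script on every interaction",
    "cache_resource keeps models in memory",
    "pydeck renders maps with webgl",
]
vectors = [embed(c) for c in chunks]  # index step: embed once, reuse per query
question = "how do reruns work in streamlit"
hits = search(question, chunks, vectors, k=1)
prompt = fmt(hits, question)
```

In an app, the index-building step is the part worth wrapping in a cached function so it is not rebuilt on every rerun.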

35) Auth Basics

Use secrets, simple login forms, or reverse proxy auth. For enterprise, front with an auth layer (Auth0/Okta) and pass headers.

if st.secrets.get("ADMIN_PASS") == st.text_input("Pass", type="password"): st.success("ok")
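For a simple password gate, a constant-time comparison avoids leaking information through timing, and an explicit guard makes sure a missing secret never matches. A minimal sketch using the standard library; `password_matches` is a hypothetical helper.

```python
import hmac
from typing import Optional


def password_matches(supplied: str, expected: Optional[str]) -> bool:
    """Constant-time comparison; a missing or empty secret never matches."""
    if not expected:
        return False
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))


# Possible wiring (assumes the ADMIN_PASS secret from the card above):
# if password_matches(st.text_input("Pass", type="password"),
#                     st.secrets.get("ADMIN_PASS")):
#     st.success("ok")
```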

36) Secrets & Config

Place keys in .streamlit/secrets.toml; access via st.secrets. Avoid committing secrets to VCS.

api_key = st.secrets["openai"]["key"]

37) Databases

Cache the engine/connection; use parameterized queries; stream results to avoid memory blowups.

from sqlalchemy import create_engine
@st.cache_resource  # one engine per server process, shared by all sessions
def get_engine(): return create_engine(st.secrets["db_url"])
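Parameterized queries and batched fetching can be shown with the standard library's `sqlite3`. This is a sketch under stated assumptions: the `users` table, its columns, and `iter_rows` are made up for illustration, and an in-memory database stands in for a real backend.

```python
import sqlite3


def iter_rows(conn, country, batch_size=2):
    """Yield matching rows in batches instead of loading everything at once."""
    # The `?` placeholder lets the driver bind the value; never format it into the SQL.
    cur = conn.execute(
        "SELECT name, country FROM users WHERE country = ? ORDER BY name",
        (country,),
    )
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            break
        yield batch


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("Ana", "PT"), ("Bo", "SE"), ("Cy", "PT"), ("Di", "PT")],
)
batches = list(iter_rows(conn, "PT"))
# A hostile value is treated as data, not SQL, so it simply matches nothing.
injected = list(iter_rows(conn, "PT' OR '1'='1"))
```

Each yielded batch could be written to a placeholder as it arrives, so memory use stays bounded by the batch size.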

38) Multiprocessing & Threads

For CPU-bound tasks, delegate to processes; for I/O use threads/async libs. Communicate back via polling UI placeholders.
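A sketch of the thread-plus-polling pattern using `concurrent.futures`; `slow_io` is a hypothetical stand-in for a blocking call. The workers only return values, and the loop that polls is where an app would update a placeholder, since calling `st.*` from worker threads is generally not supported without extra setup.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_io(n: int) -> int:
    """Stand-in for a blocking call such as a network request."""
    time.sleep(0.05)
    return n * n


ticks = 0
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(slow_io, n) for n in range(4)]
    # Poll instead of blocking; an app could refresh a placeholder on each pass.
    while not all(f.done() for f in futures):
        ticks += 1
        time.sleep(0.01)
    results = [f.result() for f in futures]
```

For CPU-bound work, `ProcessPoolExecutor` offers the same `submit`/`result` interface, provided the submitted function is importable at module level and its arguments and results are picklable.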

39) Deployment Options

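Deploy on Streamlit Community Cloud, containerize (Docker), or run behind a reverse proxy such as Nginx. Streamlit ships its own Tornado-based server, so a WSGI server like Gunicorn is not used to run it. Configure caching, secrets, and health checks.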

# Dockerfile (sketch)
FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["streamlit","run","app.py","--server.port=8501","--server.address=0.0.0.0"]

40) Q&A — “How to keep apps responsive?”

Answer: Cache heavy functions, precompute expensive data, batch API calls, stream progress, and avoid blocking the main thread (offload to workers).

Section 5 — Testing, Theming, Tips & Interview Q&A

41) App Structure

Organize into app.py, pages/, components/, lib/. Keep UI thin; put logic in functions for testability.

lib/
  data.py   # IO & caching
  charts.py # chart builders
components/
  filters.py # small UI pieces

42) Theming

Customize via .streamlit/config.toml (primary colors, font). Prefer app-level consistency over per-widget styling.

[theme]
base="light"
primaryColor="#0ea5e9"
font="sans serif"

43) Accessibility

Use semantic headings, alt text for images, keyboard-friendly widgets, and high-contrast themes.

st.title("Sales Dashboard"); st.caption("Accessible by design")

44) Testing & CI

Unit test logic modules; smoke test app launch in CI; lint with Ruff/Flake8; pin requirements for reproducibility.

pytest -q; streamlit run app.py --server.headless true
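Unit testing the logic layer needs nothing Streamlit-specific. The sketch below shows a pure function and pytest-style tests for it; `churn_rate` is a hypothetical example of the kind of logic you would keep in lib/.

```python
def churn_rate(lost: int, total: int) -> float:
    """Percentage of customers lost, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * lost / total, 1)


# pytest discovers functions named test_*; plain asserts are enough.
def test_churn_rate():
    assert churn_rate(48, 1000) == 4.8
    assert churn_rate(0, 10) == 0.0


def test_churn_rate_rejects_empty_base():
    try:
        churn_rate(1, 0)
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

The UI code then only formats the result, for example by passing it to `st.metric`.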

45) Performance Checklist

Cache I/O and models, paginate large tables, avoid per-row Python loops (vectorize), pre-aggregate, compress images, and memoize chart builders.

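@st.cache_data(show_spinner=False)
def chart_data(df): return df.groupby("cat", as_index=False)["val"].sum()  # hypothetical builder; "cat"/"val" as in card 22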

46) Common Pitfalls

Recreating DB clients on every rerun, missing keys in session_state, doing heavy work on each keystroke, storing secrets in code, and not handling file types.

assert "key" in st.session_state, "Initialize first!"

47) Custom Components

Embed custom JS/HTML or build components via Streamlit’s component API (React-based). Use when built-ins aren’t enough.

st.components.v1.html("<div>Custom widget</div>", height=60)

48) Security & Privacy

Validate uploads, sanitize HTML, throttle external API calls, scrub PII in logs, and set CORS/headers at proxy level when public.
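Two of these checks, upload validation and HTML escaping, fit in a few lines of standard-library Python. This is a sketch: the allow-list, the 5 MiB cap, and the helper names `check_upload` and `safe_text` are assumptions to adapt to your app.

```python
import html
from pathlib import Path

ALLOWED_SUFFIXES = {".csv", ".txt"}  # assumed allow-list
MAX_BYTES = 5 * 1024 * 1024          # assumed 5 MiB cap


def check_upload(filename: str, size: int) -> list[str]:
    """Return reasons to reject an upload; an empty list means it passes."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported file type: {suffix or 'none'}")
    if size > MAX_BYTES:
        problems.append("file too large")
    return problems


def safe_text(user_text: str) -> str:
    """Escape user text before embedding it in any HTML you render."""
    return html.escape(user_text)
```

An extension check is only a first filter; parsing the content (for example, actually reading the CSV) is what confirms the file is what it claims to be.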

49) Production Ops

Monitor with health pings, log errors/latency, set memory/CPU limits, autoscale containers if traffic spikes, and add feature flags.

50) Interview Q&A — 20 Practical Questions (Expanded)

1) Why Streamlit? Rapid Python-only UI for data/ML—fast iteration, low ceremony, built-in hosting options.

2) How does rerun work? Every widget interaction triggers a top-to-bottom script rerun; the UI reflects current state.

3) session_state use cases? Persist selections, wizard steps, auth flags, and cross-page state.

4) cache_data vs cache_resource? data caches function outputs (pure), resource caches heavy objects (clients/models).

5) Forms vs live inputs? Forms batch inputs and submit once; live inputs rerun on every change—use forms for heavy backends.

6) Handling long tasks? Spinners, progress, log streaming, and background workers; avoid blocking UI.

7) Multiple pages? Place files in pages/; share resources via caches and session_state.

8) Chart libraries? Native charts, Altair, Plotly, Matplotlib, PyDeck; choose based on interactivity and data size.

9) Deploy options? Streamlit Community Cloud, Docker, or your server behind a reverse proxy; set secrets and config.

10) Auth strategies? Simple password in secrets, OAuth via proxy, or custom components; Streamlit itself does not tie you to a particular auth provider.

11) File uploads at scale? Raise limits, stream chunks, validate types, and process asynchronously if heavy.

12) Prevent expensive recompute? Cache results, memoize chart data, and use forms/callbacks instead of per-keystroke reruns.

13) How to persist selections? Initialize keys, write callbacks to sync values, and store JSON in session_state.

14) Dataframe performance? Sample, paginate, or summarize; avoid rendering 100k+ rows; use data_editor selectively.

15) Integrating databases? Cache connections, parameterize queries, stream results, and handle timeouts.

16) Building LLM chat? Keep a messages list, stream tokens, throttle calls, cache embeddings for RAG.

17) Custom components vs built-ins? Use built-ins first; reach for components when you need bespoke UI/JS behavior.

18) Theming best practice? Configure global theme in .streamlit/config.toml and keep palettes consistent.

19) Testing Streamlit apps? Unit test logic; smoke test app launch; consider Playwright for E2E user flows.

20) Common mistakes? No caching, recreating resources per rerun, mixing secrets into code, giant tables, and blocking calls without feedback.