Julia Pocket Book — Uplatz

50 in-depth cards • Wide layout • Expanded examples • Last card = 20 Interview Q&A

Section 1 — Foundations

1) What is Julia?

Julia is a high-level, high-performance programming language for technical and numerical computing. It blends Python-like syntax with C-like speed via JIT compilation on LLVM. Core design goals: multiple dispatch, composability, and performance without sacrificing productivity.

julia> 1 + 1
2
julia> typeof(1.0), typeof(1)
(Float64, Int64)

2) Installing Julia & REPL Basics

Download binaries (Windows/macOS/Linux) or use a package manager. The REPL supports pkg> mode for package management (]), help> mode for docs (?), and shell mode (;).

# Enter Pkg mode:
julia> ]
(@v1.x) pkg> add DataFrames Plots

3) Project Environments

Isolate dependencies per project with Project.toml + Manifest.toml. Activate an env and keep reproducible builds.

(@v1.x) pkg> activate MyProject
(MyProject) pkg> add CSV DataFrames
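
To reproduce an environment from a checked-in Project.toml and Manifest.toml, instantiate it:

(@v1.x) pkg> activate .
(MyProject) pkg> instantiate   # installs the exact versions pinned in Manifest.toml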

4) Variables & Types

Type annotations are optional; annotate when you need dispatch control or a type-checked variable. Every value has a concrete type; abstract types exist to organize hierarchies.

x = 3.14        # Float64
y::Int = 42     # typed variable (typed globals require Julia ≥ 1.8)
z = "hi"        # String

5) Multiple Dispatch (Core Idea)

Functions dispatch on the types of all arguments, enabling clear generic code and specialized fast paths.

area(r::Real) = π * r^2
area(a::Real, b::Real) = a*b
area(2), area(3,4)  # different methods, same name

6) Unicode, Math & Broadcasting

Use Unicode in identifiers; broadcasting (.) vectorizes functions over arrays without writing loops.

θ = 0.5; f(x) = sin(x) + cos(x)
xs = 0:0.1:1
ys = f.(xs)   # broadcast

7) Control Flow

Standard if/elseif/else, for, while, and try/catch. return is optional; a function returns its last evaluated expression.

function signum(x)
  x > 0 ? 1 : x < 0 ? -1 : 0
end

8) Functions & Do-Block Syntax

Anonymous functions, short function definitions, do-blocks for passing functions as the last arg.

g(x) = x^2
map(x -> x^2, 1:5)
open("out.txt","w") do io
  write(io, "Hello")
end

9) Modules & Namespaces

Organize code as modules; control exports; avoid name clashes. Use include to split files.

module Geo
export area
area(r) = π*r^2
end
using .Geo: area

10) Packages & Precompilation

Packages are precompiled for faster loading. For app speed-up, precompile dependencies and consider sysimages (PackageCompiler.jl).

(@v1.x) pkg> add PackageCompiler
julia> using PackageCompiler

11) Documentation & Help

Use docstrings and the ? help mode. Add examples and signatures for clarity.

"""
    area(r)

Compute circle area.
"""
area(r) = π*r^2

12) Testing & CI

Built-in Test stdlib for unit tests; integrate with GitHub Actions for CI on multiple Julia versions.

using Test
@test area(1) ≈ π

Section 2 — Language Essentials

13) Type System: Abstract vs Concrete

Concrete types can be instantiated and are required for performance-critical fields. Abstract types define interfaces and hierarchies.

abstract type Animal end
struct Dog <: Animal; name::String; end
bark(a::Dog) = println("Woof, $(a.name)!")

14) Parametric Types

Define generic containers and algorithms. Specialize on element types for performance.

struct Bucket{T}
  items::Vector{T}
end
Bucket{Int}([1,2,3])

15) Multiple Dispatch Patterns

Prefer method tables over conditionals. Add specialized methods for hot paths; keep generic fallback.

using LinearAlgebra: norm
distance(a::Real, b::Real) = abs(a - b)
distance(a::AbstractVector, b::AbstractVector) = norm(a - b)

16) Mutability & Immutability

struct is immutable by default (fast, stack-friendly). Use mutable struct when fields must change.

struct Point; x::Float64; y::Float64; end
mutable struct Box; p::Point; end
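
Fields of an immutable struct cannot be reassigned after construction; a mutable struct's can:

b = Box(Point(0.0, 0.0))
b.p = Point(1.0, 2.0)     # OK: Box is mutable
p = Point(0.0, 0.0)
# p.x = 1.0               # errors: Point is immutable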

17) Arrays & Views

Julia arrays are column-major (like Fortran). Use views to avoid allocations when slicing large arrays.

A = rand(5,5)
v = @view A[1:3, 2]   # no copy
sum(v)

18) Dictionaries & Sets

Hash-based collections with parametric key/value types. Prefer concrete key/value types for speed.

d = Dict{String,Int}("a"=>1, "b"=>2)
d["c"] = 3                        # insert or update
haskey(d, "a"), collect(keys(d))  # membership test; materialize the key iterator

19) Comprehensions & Generators

Concise constructs that avoid explicit loops; generators stream elements lazily.

squares = [x^2 for x in 1:10]
g = (x^2 for x in 1:10); sum(g)

20) Error Handling

Throw exceptions for exceptional states, not for ordinary control flow. Catch specifically and rethrow what you cannot handle.

try
  error("Oops")
catch e
  @warn "Caught" e
end
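
To handle only one kind of error, test the exception and rethrow anything else:

try
  parse(Int, "abc")
catch e
  e isa ArgumentError || rethrow()
  @warn "Bad input, falling back to default" e
end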

21) Modules & Include Strategy

Use a top-level module per package; include sub-files inside the module; avoid globals in code paths.

module MyPkg
include("io.jl"); include("math.jl")
end

22) Metaprogramming 101

Expressions (Expr) and macros transform code before evaluation. Use macros to remove boilerplate or insert checks.

@time sum(rand(10^7))   # macro expands to timing code
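
A minimal sketch of a hypothetical @check macro that inserts a runtime assertion carrying the failing expression's text:

macro check(ex)
  msg = string(ex)
  return :( $(esc(ex)) || error("check failed: ", $msg) )
end
@check 1 + 1 == 2   # passes; @check 1 > 2 would throw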

Section 3 — Performance, Parallelism & Interop

23) Performance: Key Rules

Type-stable functions, avoid global mutation, minimize allocations, use in-place (!) versions, and keep concrete container element types.

function sum_inplace!(y, x)
  @inbounds @simd for i in eachindex(x, y)   # eachindex(x, y) checks shapes match
    y[i] += x[i]
  end
  return y
end

24) Benchmarking

Use BenchmarkTools.jl for reliable timings (warming, sampling, avoiding global scope).

using BenchmarkTools
x = rand(10^6)
@btime sum($x);   # interpolate globals with $ so setup isn't timed

25) Profiling & Flamegraphs

Use Profile stdlib and tools like ProfileView or StatProfilerHTML to find hot spots.

using Profile
@profile heavy_compute()
Profile.print()

26) Memory Allocations

Inspect allocations with @time and @btime. Replace temp arrays with views or preallocated buffers.

@time map(sin, rand(10^6))  # watch allocations
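
For example, writing into a preallocated buffer with a fused in-place broadcast avoids the temporary:

x = rand(10^6)
y = similar(x)    # allocate once, reuse
y .= sin.(x)      # fused kernel, no temporary array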

27) Multithreading

Start Julia with -t N (or set JULIA_NUM_THREADS). Use Threads.@threads for loops and race-free accumulation; a plain shared variable would be a data race.

using Base.Threads
total = Atomic{Float64}(0.0)    # @atomic works on struct fields, not locals
@threads for i in 1:10^6
  atomic_add!(total, sin(i))    # race-free accumulation
end
total[]

28) Distributed Computing

Add processes with addprocs; use @distributed and pmap for cluster-like tasks.

using Distributed; addprocs(4)
@everywhere f(x)=x^2
@time @distributed (+) for i in 1:10^7
  f(i)
end

29) GPU with CUDA.jl

Move arrays to GPU with CuArray; broadcast and linear algebra run on GPU. Write custom kernels if needed.

using CUDA
x = CUDA.rand(10^6); y = CUDA.rand(10^6)
z = x .+ y  # runs on GPU

30) Interop: Python & R

Call Python via PyCall.jl and R via RCall.jl. Great for reusing mature libs.

using PyCall
np = pyimport("numpy")
np.mean([1,2,3])

31) Interop: C/C++

Use ccall to call C; for C++ consider CxxWrap.jl or write C wrappers. Ensure stable ABI and types.

# double cos_c(double x) from libm
ccall((:cos, "libm"), Cdouble, (Cdouble,), 0.5)

32) PackageCompiler & Sysimages

Create a custom sysimage that precompiles your stack to reduce latency in CLIs/services.

using PackageCompiler
create_sysimage([:DataFrames, :CSV]; sysimage_path="AppSys.so")

Section 4 — Data, Statistics, ML & Visualization

33) DataFrames.jl Basics

Powerful tabular data with select, transform, groupby, and joins.

using DataFrames, CSV, Statistics   # Statistics provides mean
df = DataFrame(a=1:5, b=rand(5))
combine(groupby(df, :a), :b => mean)

34) CSV.jl & Arrow.jl

Fast I/O: CSV for delimited files; Arrow for columnar in-memory + interop with other systems.

using CSV, Arrow, DataFrames
df = CSV.read("data.csv", DataFrame)
Arrow.write("data.arrow", df)

35) Querying & Pipe Macros

Use |> pipes or DataFramesMeta.jl for readable data transforms.

using DataFramesMeta
@chain df begin
  @transform(:c = :a .* :b)
  @subset(:c .> 0.5)
end

36) Stats & Distributions

Distributions.jl for probabilistic models; Statistics stdlib for means, variances, etc.

using Distributions, Statistics
x = rand(Normal(0,1), 10^5)
(mean(x), std(x))

37) GLM & Regression

GLM.jl fits linear/logistic/Poisson models; formula DSL similar to R.

using GLM
m = lm(@formula(y ~ x1 + x2), df)
coef(m)

38) MLJ.jl (Model Zoo)

Unified interface over many ML algorithms; pipelines, tuning, and evaluation.

using MLJ
X = MLJ.table(rand(100, 3))                # features as a table
y = coerce(rand(0:1, 100), Multiclass)     # classifiers need a categorical target
Tree = @load DecisionTreeClassifier pkg=DecisionTree
mach = machine(Tree(), X, y) |> fit!
predict(mach, selectrows(X, 1:3))

39) Flux.jl (Deep Learning)

Pure-Julia DL with differentiable programming; supports GPU via CUDA.jl.

using Flux
m = Chain(Dense(10 => 32, relu), Dense(32 => 1))   # Pair syntax for layer sizes
loss(x,y) = Flux.Losses.mse(m(x), y)
opt = Descent(1e-3)

40) SciML & Differential Equations

DifferentialEquations.jl solves ODE/PDE/SDE; pairs with machine learning for universal differential equations.

using DifferentialEquations
f(u,p,t) = 1.01u
prob = ODEProblem(f, 1.0, (0.0, 1.0))
solve(prob)

41) Plotting: Plots.jl & Makie.jl

Plots.jl is backend-agnostic (GR, PyPlot, Plotly). Makie.jl for high-performance interactive viz.

using Plots
plot(0:0.1:10, sin, label="sin")

42) Notebooks & Pluto.jl

Use IJulia with Jupyter or Pluto.jl for reactive notebooks (great for teaching & reproducibility).

using Pluto; Pluto.run()

Section 5 — Building Packages, Deployment & Interview

43) Packaging Your Library

Generate package skeletons with PkgTemplates.jl (tests, CI, docs). Follow semantic versioning and write docstrings.

(@v1.x) pkg> add PkgTemplates
julia> using PkgTemplates
Template(; user="uplatz", plugins=[GitHubActions()])("MyPkg")

44) Documentation: Documenter.jl

Build nice docs from docstrings + Markdown; host on GitHub Pages; auto-deploy via CI.

(@v1.x) pkg> add Documenter
# make.jl builds docs; push to gh-pages
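
A minimal make.jl sketch, assuming a hypothetical package MyPkg hosted under the uplatz GitHub account:

using Documenter, MyPkg
makedocs(sitename="MyPkg.jl", modules=[MyPkg])
deploydocs(repo="github.com/uplatz/MyPkg.jl.git")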

45) Configuration & Secrets

Use environment variables, TOML/JSON files; avoid globals by passing config structs. Keep secrets out of the repo.

using TOML
cfg = TOML.parsefile("config.toml")
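
Environment variables are read via the ENV dict; APP_PORT and API_TOKEN below are hypothetical names:

port = parse(Int, get(ENV, "APP_PORT", "8080"))   # default when unset
token = ENV["API_TOKEN"]                          # KeyError if the secret is missing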

46) Services & CLIs

Build CLIs with Comonicon.jl or ArgParse.jl; compile to a system image for fast startup; containerize with Docker.

# ArgParse example
using ArgParse
s = ArgParseSettings()
@add_arg_table! s begin
  "--port"
    arg_type = Int
    default = 8080
end
args = parse_args(s)   # returns a Dict of parsed options

47) Data Engineering Pipelines

Combine CSV/Parquet/Arrow with DataFrames + Query transforms; schedule with Airflow or Dagger.jl; serialize with JLD2/Arrow.

using Parquet, DataFrames
df = DataFrame(rand(100, 3), :auto)
write_parquet("out.parquet", df)   # Parquet.jl exports write_parquet
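
For native-Julia objects, JLD2 round-trips arbitrary data; a minimal sketch:

using JLD2
@save "out.jld2" df    # store df under the name "df"
@load "out.jld2" df    # restore df into the current scope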

48) Numerical Stability & Precision

Use BigFloat for high precision, Kahan summation for numeric stability, and tolerances for comparisons.

setprecision(256)
bigsum = sum(BigFloat.(rand(10^5)))
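
A minimal Kahan (compensated) summation sketch:

function kahan_sum(xs)
  s, c = 0.0, 0.0            # running sum and compensation term
  for x in xs
    y = x - c                # subtract the error carried from the last step
    t = s + y                # low-order bits of y are lost here...
    c = (t - s) - y          # ...and recovered into c
    s = t
  end
  return s
end
kahan_sum(fill(0.1, 10^6))   # closer to 1.0e5 than a naive loop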

49) Common Pitfalls

Type instability (check with @code_warntype), non-concrete containers, untyped globals in hot code, accidental allocations in inner loops, and overusing macros where functions suffice.

@code_warntype sum_inplace!([1.0,2.0],[3.0,4.0])

50) Interview Q&A — 20 Practical Questions (Expanded)

1) Why Julia for numerical work? Multiple dispatch + LLVM JIT yield C-like speed with high-level syntax; rich ecosystem (SciML, Flux, DataFrames) supports end-to-end workflows.

2) Multiple dispatch vs OOP? Dispatch on all arg types (not just receiver). Encourages decoupled, composable APIs and specialization without inheritance hierarchies.

3) Abstract vs concrete types? Abstracts define interfaces and can’t be instantiated; concretes are instantiable and necessary for performant fields/arrays.

4) What is type stability? The return type is predictable from the argument types alone, not their values. Stable code lets the compiler optimize and avoid dynamic dispatch (see the sketch after this list).

5) How to diagnose performance? Use BenchmarkTools, @code_warntype, and Profile; look for allocations and for abstract types highlighted in red by @code_warntype.

6) Broadcasting vs loops? Broadcasting is syntactic sugar for fused loops that avoid temporaries; performance similar to hand-written loops.

7) Why column-major matters? Access memory contiguously by columns for cache-friendly performance; prefer column-wise operations.

8) When to use views? For large slices to avoid copies; @view or view returns a lazy window over data.

9) Package environments? Project-local deps via Project/Manifest ensure reproducibility and isolation across apps.

10) Precompile vs sysimage? Precompile speeds package load; sysimage embeds compiled code into the Julia system image to cut startup latency.

11) Threads vs Distributed? Threads share memory (fine-grained parallel loops). Distributed uses multiple processes (even across machines) with message passing.

12) How to go GPU? Use CUDA.jl; move arrays to CuArray and rely on broadcast/BLAS; write kernels only when needed.

13) Interop strategy? Reuse Python (PyCall) or C libs (ccall) where mature stacks exist; gradually replace hot paths with Julia.

14) Handling missing data? Use the missing value with Union{T,Missing} element types; skipmissing, coalesce, and DataFrames' missing-aware operations handle the rest.

15) Numerical stability tips? Use BigFloat when necessary, stable algorithms (Kahan), tolerances (isapprox) for comparisons.

16) Designing APIs? Generic methods on abstract interfaces; concrete specializations for hot cases; avoid overly deep module hierarchies.

17) Testing patterns? Use @test, @test_throws, randomized inputs via Random, and checks against known invariants.

18) DataFrames vs SQL? DataFrames for in-memory analytics; pair with SQLite.jl/ODBC.jl for external data; push heavy ops to DB when needed.

19) Common perf anti-pattern? Non-concrete containers like Vector{Any}, using globals, creating temporaries in inner loops, not using @inbounds/@simd when safe.

20) Production deployment? PackageCompiler sysimage, CLI with ArgParse/Comonicon, Docker image with pinned Project/Manifest, observability (logging/metrics), and CI across Julia versions.
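
The type-stability sketch referenced in Q4: the first function's return type depends on the value of x, the second depends only on its type:

unstable(x) = x > 0 ? x : 0         # Float64 or Int, depending on the value
stable(x)   = x > 0 ? x : zero(x)   # always typeof(x)
@code_warntype unstable(1.5)        # Body::Union{Float64, Int64} flagged in red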