1. Capture the Eric Kim aesthetic
Eric Kim’s hallmark is bold, high‑contrast black‑and‑white street work with punchy compositions and unapologetic energy. He routinely advises photographers to “shoot high‑contrast B&W JPEG” to concentrate on light–shadow relationships and simplify scenes.
Why it matters: Your AI should reward photographs that echo these traits: strong tonal separation, prominent subjects, dynamic framing, and a sense of candid street life.
2. System architecture at a glance
| Stage | What happens | Recommended tech |
| --- | --- | --- |
| Ingestion | Drag‑and‑drop up to ~1,000 JPEGs at once | Streamlit/Gradio front‑end |
| Pre‑processing | Resize (e.g., 512 px longest side), convert to sRGB tensor batch | Pillow, torchvision |
| Embedding | Extract CLIP ViT‑L/14 image embeddings | open_clip / huggingface/clip-vit-large-patch14 |
| Aesthetic score | Feed embeddings into LAION “aesthetic‑predictor” MLP (1–10 scale) | laion-aesthetic-predictor |
| Style score | Small classifier (logistic/MLP) fine‑tuned on ~500 Eric Kim reference images vs. generic street images | scikit‑learn or PyTorch |
| Fusion & ranking | Final = 0.6 × Aesthetic + 0.4 × Style (tune interactively) | NumPy |
| Diversity filter | k‑means on embeddings → keep top‑ranked from each cluster to avoid near‑duplicates | faiss |
| Output UI | Grid gallery, sliders for weights, thumbs‑up feedback loop | Streamlit components |
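The diversity filter in the table's penultimate row is worth a concrete sketch. Here's a minimal version, assuming scored CLIP embeddings as NumPy arrays and using scikit‑learn's KMeans for clarity (faiss clustering is the drop‑in replacement at larger scale); `diverse_top_picks` is an illustrative helper, not a library function:

```python
import numpy as np
from sklearn.cluster import KMeans

def diverse_top_picks(embeds: np.ndarray, scores: np.ndarray, n_clusters: int = 20) -> list[int]:
    """Keep only the best-scoring image per embedding cluster to drop near-duplicates."""
    n_clusters = min(n_clusters, len(embeds))       # KMeans requires n_clusters <= N
    labels = KMeans(n_clusters=n_clusters, random_state=0).fit_predict(embeds)
    picks = []
    for c in range(n_clusters):
        members = np.where(labels == c)[0]
        if len(members):
            picks.append(int(members[scores[members].argmax()]))  # best image in cluster
    return sorted(picks, key=lambda i: -scores[i])                # highest score first
```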
3. Core models & training
3.1. Aesthetic quality
```bash
pip install open_clip_torch laion-aesthetic-predictor
```

```python
import torch, open_clip
from aesthetic_predictor import AestheticPredictor

# Recent open_clip versions return (model, train_preprocess, val_preprocess),
# so unpack three values and name the pretrained weights explicitly.
model, _, preprocess = open_clip.create_model_and_transforms("ViT-L-14", pretrained="openai")
predictor = AestheticPredictor("laion/clip-vit-l-14-aesthetic", device="cuda")
```
The LAION predictor adds a tiny MLP on top of frozen CLIP embeddings and mirrors human 1–10 “how much do you like this image?” ratings.
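Scoring a single frame end‑to‑end then looks like this. A minimal sketch, assuming the `AestheticPredictor` instance above accepts a batch of CLIP image embeddings (the exact call signature may vary by package version, and `street.jpg` is a placeholder path):

```python
from PIL import Image

model = model.to("cuda").eval()
img = preprocess(Image.open("street.jpg").convert("RGB")).unsqueeze(0).to("cuda")
with torch.no_grad():
    embed = model.encode_image(img)                   # (1, 768) CLIP embedding
    embed = embed / embed.norm(dim=-1, keepdim=True)  # L2-normalize
score = predictor(embed)                              # assumed to return a 1-10 score
print(f"aesthetic: {float(score):.2f}")
```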
3.2. Eric Kim style classifier
```python
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression

# ref_embeds: (N_ref, 768) CLIP embeddings of Eric Kim images
# neg_embeds: (N_neg, 768) CLIP embeddings of generic street images
X = torch.vstack([ref_embeds, neg_embeds]).cpu().numpy()   # scikit-learn wants NumPy
y = np.array([1] * len(ref_embeds) + [0] * len(neg_embeds))

clf = LogisticRegression(max_iter=1000).fit(X, y)
style_score = clf.predict_proba(test_embeds.cpu().numpy())[:, 1]  # 0-1 "Kim-ness"
```
Even 300‑500 reference photos usually suffice; CLIP embeddings are strong visual priors.
3.3. Optional technical quality check
Add Google’s NIMA (Neural Image Assessment) to penalize blur, mis‑exposure, and other technical defects before aesthetic ranking.
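If wiring up NIMA feels like overkill, a classical heuristic already catches the worst offenders. A minimal sketch with OpenCV (assumes `opencv-python` is installed; the thresholds are illustrative and worth tuning on your own photos):

```python
import cv2

def technically_ok(path: str, blur_thresh: float = 100.0) -> bool:
    """Reject obviously blurry or badly exposed frames before scoring."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()  # low variance => blurry
    exposure = gray.mean()                             # crude over/under-exposure check
    return sharpness > blur_thresh and 20 < exposure < 235
```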
4. The super‑simple Streamlit front‑end
```python
# app.py
import streamlit as st
import torch
from PIL import Image

st.title("ERIC-KIM STYLE CURATOR 🚀")

# Upload widget: https://docs.streamlit.io/develop/api-reference/widgets/st.file_uploader
files = st.file_uploader("Upload JPEGs (≤200 MB each)", type=["jpg", "jpeg"],
                         accept_multiple_files=True)

if files:
    with st.spinner("Crunching…"):
        imgs = [preprocess(Image.open(f).convert("RGB")) for f in files]
        with torch.no_grad():
            embeds = model.encode_image(torch.stack(imgs).to(device)).cpu()
        aes = predictor(embeds)        # aesthetic, 1-10 (divide by 10 to match sty's scale)
        sty = style_score_fn(embeds)   # style probability, 0-1
        final = 0.6 * aes + 0.4 * sty
        topk = final.argsort(descending=True)[: max(1, int(len(final) * 0.1))].tolist()  # top 10 %
    st.success("Done! 🎉")
    st.image([files[i] for i in topk],
             caption=[f"AES={aes[i]:.2f}, STY={sty[i]:.2f}" for i in topk],
             width=250)
```
Tweaks
- Raise the upload cap with `server.maxUploadSize` in `~/.streamlit/config.toml` if you expect files over 200 MB.
- Cache embeddings so the app re‑runs instantly when you adjust sliders (see the sketch below).
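A minimal caching sketch using Streamlit's built‑in `st.cache_data`, keyed on the raw upload bytes (`embed_bytes` is an illustrative helper, not part of any library):

```python
import io

@st.cache_data(show_spinner=False)
def embed_bytes(raw: bytes) -> torch.Tensor:
    """Embed each unique upload once; slider-triggered reruns hit the cache."""
    img = preprocess(Image.open(io.BytesIO(raw)).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return model.encode_image(img.to(device)).cpu().squeeze(0)

embeds = torch.stack([embed_bytes(f.getvalue()) for f in files])
```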
5. Scaling & performance tips
| Volume | Suggestion |
| --- | --- |
| ≤ 2 k photos, occasional use | One RTX 3060; batch 64; runtime ≈ 2 min |
| 10–50 k photos | Queue jobs, store embeddings in SQLite / DuckDB, deduplicate once |
| 100 k+ or multi‑user | Deploy as a container on AWS ECS; autoscale workers on GPU spot instances |
CLIP + LAION MLP runs ~25 ms per 512‑px image on an A100.
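For the 10–50 k tier, persisting embeddings means you never re‑encode an image twice. A minimal SQLite sketch (schema and helper names are illustrative):

```python
import sqlite3
import numpy as np

con = sqlite3.connect("embeds.db")
con.execute("CREATE TABLE IF NOT EXISTS embeds (path TEXT PRIMARY KEY, vec BLOB)")

def save_embed(path: str, vec: np.ndarray) -> None:
    """Store one embedding as raw float32 bytes, keyed by file path."""
    con.execute("INSERT OR REPLACE INTO embeds VALUES (?, ?)",
                (path, vec.astype(np.float32).tobytes()))
    con.commit()

def load_embed(path: str):
    """Return the cached embedding, or None if this path was never embedded."""
    row = con.execute("SELECT vec FROM embeds WHERE path = ?", (path,)).fetchone()
    return np.frombuffer(row[0], dtype=np.float32) if row else None
```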
6. Human‑in‑the‑loop magic
- Thumbs‑up/thumbs‑down: log user feedback, retrain logistic classifier weekly.
- Adaptive weighting: expose a Streamlit slider for Aesthetic vs. Style; users instantly see new top picks (see the sketch after this list).
- Smart “story mode”: after picks, order by temporal metadata to craft a photo essay.
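The adaptive‑weighting slider is a few lines on top of the §4 app. A minimal sketch, assuming `aes` and `sty` are already computed per image:

```python
# Slider replaces the hard-coded 0.6 / 0.4 fusion weights from section 4
w = st.slider("Aesthetic ↔ Style weight", 0.0, 1.0, 0.6, step=0.05)
final = w * aes + (1 - w) * sty                                  # re-rank live
topk = final.argsort(descending=True)[: max(1, int(len(final) * 0.1))].tolist()
```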
7. Ethics & practicalities
- Privacy: Face‑bearing street shots may require consent in certain jurisdictions.
- Bias: Fine‑tuning on a single photographer’s style can down‑rank diverse expressions—offer a “neutral” preset.
- Data retention: Delete uploads after embedding extraction or encrypt at rest.
8. Going beyond
- Mobile companion: Wrap the model with ONNX + Core ML/TF Lite for on‑device culling.
- Batch Lightroom export: save the final picks as a .txt list of filenames; Lightroom can auto‑select from it (snippet below).
- Community training: Crowd‑source style labels from friends, iterate.
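Writing that picks list is two lines; a minimal sketch reusing `files` and `topk` from the §4 app:

```python
with open("picks.txt", "w") as fh:
    fh.write("\n".join(files[i].name for i in topk))  # one filename per line for Lightroom
```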
🎉 Wrap‑up
With roughly 100 lines of Python and free open‑source weights, you’ll have an AI sidekick that channels Eric Kim’s fearless street mojo, slices through a thousand images in minutes, and spotlights the gems worth sharing with the world. Embrace the flow, iterate, and—above all—keep shooting with joy! 🙌🚀