Link Dump

This post is a dumping ground for links I found interesting or useful. Some of them I have only skimmed before adding them to my read-later list.

For resources I genuinely recommend, see this list.

Articles/blogposts:

“Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning” by Tim Dettmers: https://timdettmers.com/2023/01/30/which-gpu-for-deep-learning/
“Fuck You, Show Me The Prompt.” by Hamel Husain: https://hamel.dev/blog/posts/prompt/
“Common statistical tests are linear models (or: how to teach stats)” by Jonas Kristoffer Lindeløv: https://lindeloev.github.io/tests-as-linear/
“Should You Run a Survey?” by Maddie Brown: https://www.nngroup.com/articles/should-you-run-a-survey/
“Understanding UMAP” by Andy Coenen and Adam Pearce: https://pair-code.github.io/understanding-umap/
“Deep Playground”: http://playground.tensorflow.org/
“AI safety is not a model property” by Arvind Narayanan and Sayash Kapoor: https://www.aisnakeoil.com/p/ai-safety-is-not-a-model-property
“Models All The Way Down” by Christo Buschek and Jer Thorp: https://knowingmachines.org/models-all-the-way (investigation of LAION-5B)
“Mental Models: The Best Way to Make Intelligent Decisions (~100 Models Explained)”: https://fs.blog/mental-models/ (see also: “Deliberate practice guide”)
“Nothing survives transcription, nothing doesn’t survive transcription” by Allison Parrish: https://posts.decontextualize.com/nothing-survives-transcription/
“AI in education is a public problem” by Ben Williamson: https://codeactsineducation.wordpress.com/2024/02/22/ai-in-education-is-a-public-problem/
“The Function Colour Myth. Or: async/await is not what you think it is.” by Cory Benfield: https://lukasa.co.uk/2016/07/The_Function_Colour_Myth/
“The Matrix Calculus You Need For Deep Learning” by Terence Parr and Jeremy Howard: https://explained.ai/matrix-calculus/
“Wada Sanzo, A Dictionary of Color Combinations”: https://sanzo-wada.dmbk.io/
Tools for forensic metascience by James Heathers: https://sites.google.com/view/jamesheathers/home
“Creating a LLM-as-a-Judge that drives business results” by Hamel Husain (via Simon Willison’s link blog): https://simonwillison.net/2024/Oct/30/llm-as-a-judge/
“Deep Learning Tuning Playbook” by Varun Godbole, George Dahl, Justin Gilmer, Christopher Shallue, Zachary Nado: https://github.com/google-research/tuning_playbook
“Free knowledge based on Creative Commons licenses: Consequences, risks and side-effects of the license module ‘non-commercial use only – NC’” by Paul Klimpel: https://meta.wikimedia.org/wiki/Free_knowledge_based_on_Creative_Commons_licenses/en

Papers:

“Pretrained Transformers for Text Ranking: BERT and Beyond” by Jimmy Lin, Rodrigo Nogueira, Andrew Yates: https://arxiv.org/abs/2010.06467
“Estimating logit models with small samples” by Carlisle Rainey and Kelly McCaskey: https://www.cambridge.org/core/journals/political-science-research-and-methods/article/estimating-logit-models-with-small-samples/712F6039372475B650B123E9A1FE60E6
“What are the most important statistical ideas of the past 50 years?” by Andrew Gelman and Aki Vehtari: https://arxiv.org/abs/2012.00174
“Combining Doubly Robust Methods and Machine Learning for Estimating Average Treatment Effects for Observational Real-world Data” by Xiaoqing Tan, Shu Yang, Wenyu Ye, Douglas Faries, Ilya Lipkovich, Zbigniew Kadziola: https://arxiv.org/abs/2204.10969
“They Don’t Read Very Well: A Study of the Reading Comprehension Skills of English Majors at Two Midwestern Universities”: https://muse.jhu.edu/pub/1/article/922346 💬 This goes viral every now and then. The method is interesting: it probes reading comprehension by having students talk aloud to a proctor as they read the first few paragraphs of Bleak House.

Books:

“Deep Learning: Foundations and Concepts” by Christopher Bishop and Hugh Bishop: https://www.bishopbook.com/
“Engineering a Safer World: Systems Thinking Applied to Safety” by Nancy G. Leveson: https://direct.mit.edu/books/oa-monograph/2908/Engineering-a-Safer-WorldSystems-Thinking-Applied
“Fairness and Machine Learning: Limitations and Opportunities” by Solon Barocas, Moritz Hardt, and Arvind Narayanan: https://fairmlbook.org/
“Active Statistics” by Andrew Gelman and Aki Vehtari: https://avehtari.github.io/ActiveStatistics/
“Improving Your Statistical Inferences” by Daniël Lakens: https://lakens.github.io/statistical_inferences/
“Teaching Accessible Computing” by Alannah Oleson, Amy J. Ko, and Richard Ladner: https://bookish.press/tac
“Algorithms for Modern Hardware” by Sergey Slotin: https://en.algorithmica.org/hpc/
“The Math Academy Way” (as well as “Introduction to Algorithms and Machine Learning: from Sorting to Strategic Agents” and linear algebra/calculus books) by Justin Skycak: https://www.justinmath.com/books/
“A User’s Guide to Statistical Inference and Regression” by Matthew Blackwell: https://mattblackwell.github.io/gov2002-book/
“Kalman and Bayesian Filters in Python” by Roger Labbe: https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python
“The brms Book: Applied Bayesian Regression Modelling Using R and Stan” by Paul Bürkner: http://paulbuerkner.com/software/brms-book/
“Test-Driven Web Development with Python” by Harry Percival: https://www.obeythetestinggoat.com/pages/book.html
“Unraveling principal component analysis” by Peter Bloem: https://peterbloem.nl/publications/unraveling-pca

Datasets/resources:

NIH collection of free vector graphics for presentations: https://bioart.niaid.nih.gov/