Learned a lot about #airflow at scale over the last few weeks doing a POC for #CommonCrawlFoundation - some of it is not super well documented, so I wrote up my learnings: https://www.jason-grey.com/posts/2025/airflow-at-scale/
14.3.2025 19:57

29 nodes up to 90km from Fenway Park neighborhood in Boston @benbrown (and I haven't really even moved yet...)
13.8.2024 19:03

@jeffjarvis interesting timing: this newly published article in CACM ties in nicely with the suggestion I made at the end of the conference ("pairing the policy makers and lawyers with data scientists or software engineers")
The authors (Nicholas Berente, Cameron Kormylo, and Christoph Rosenkranz) propose a combination of discourse ethics and test-driven development as a framework.
https://cacm.acm.org/opinion/test-driven-ethics-for-machine-learning/
2.5.2024 18:47

Fun experiment I just finished - Common Crawl Checker - returns yes/no depending on whether a domain name is in the latest Common Crawl dataset. (Common Crawl is one of the datasets many LLMs use for training)
https://www.jason-grey.com/posts/2024/common-crawl-checker/
It's free to use, and the code is also provided... have fun!
6.2.2024 23:51

ML the ML - new little blog post for intermediate data science folks
https://www.jason-grey.com/posts/2024/ml-the-ml/
18.1.2024 20:44

Just arrived... from 2021.
24.11.2023 19:03

I've been using Linux for 20+ years, and resizing/moving partitions always makes me nervous. Never had an issue, mind you... but it's still nerve-racking. So, here's a haiku to calm you/me down...
In shadowed bytes' depth,
GParted looms, fear takes breath,
Yet, it mends, brings health.
In preparation for my talk today on #nlp #llm and #machinelearning, I took the time to write down some thoughts on how I think NLP/LLM apps should progress to make them safer for real business use.
Comments/thoughts/corrections appreciated:
https://www.jason-grey.com/posts/2023/nlp-maturity-model/
And a link for the talk:
https://www.warecorp.com/event/ai-for-natural-language-processing-1/register
I will be speaking next week:
AI for Natural Language Processing
Learn how AI-NLP techniques enable personalized user experiences, improve customer interactions, and drive higher engagement rates. From sentiment analysis to intelligent chatbots, we will showcase how AI-NLP powers efficient and engaging customer support. We will discuss the importance and implementation of guardrails in your model to ensure the returned results are accurate and on-topic.
Registration:
https://www.warecorp.com/event/ai-for-natural-language-processing-1
I see you, Rust... I see you...
https://github.com/huggingface/candle
A few snaps from my recent off-road motorcycle trip in Montana and Idaho
Made it out alive, with only a bent brake pedal.
Tuning those hyperparameters tonight... 😂
https://www.jason-grey.com/posts/2023/hyperparameter-tuning/
18.5.2023 02:10

Starting up a blog again, and hopefully remembering to toot about it. This time - a short review of a video about why "Migrations are the Hardest Problem in Computer Science" by Matt Ranney:
https://www.jason-grey.com/posts/2023/migrations-are-hard/
Original video: https://www.youtube.com/watch?v=yJOrMDMqeoI
10.5.2023 02:34