Hot Take: The Deepseek model is having a little hype curve all its own.
Saturday, February 1, 2025
I have a keeping-organised tool that summarises stuff from email, calendars etc to get my day started nicely.
I swapped out gpt-4o for deepseek-r1-distill-qwen-7b running locally and it was… pretty disappointing. I had visions of magically-vanishing OpenAI invoices but alas, I think it’s not to be. Its responses are less useful, much less aligned to the prompt (as in, there’s stuff it just ignores) and with many more hallucinations. Some quite funny, so that’s nice.
Potential caveats:
- It’s running locally, not the official hosted version, which might be making a difference
- It could of course, as ever, be a problem with me and the implementation (the universally applicable potential explanation for AI mildly sucking)
- I did do a bit of prompt adjusting, but not that much.
TL;DR: Deepseek is interesting. If the $10m training cost claim stacks up, that’s pretty impressive. I’m sure there’ll be more to come. But I don’t think it’s knocked OpenAI off the top spot just yet.