This is a throwaway first post so the site has something to render while I get set up.
Real writing about inference infrastructure, KV cache eviction, and batching is on the way. Check back soon.