if you are asking the model to rewrite some text or code with only minor changes, you can reduce your latency significantly by using Predicted Outputs, passing in the existing content as your prediction.
🔗 Links 🔗
https://platform.openai.com/docs/guid...https://pytorch.org/blog/hitchhikers-...
❤️ If you want to support the channel ❤️
Support here:
Patreon - / 1littlecoder
Ko-Fi - https://ko-fi.com/1littlecoder
🧭 Follow me on 🧭
Twitter - / 1littlecoder