GritLM-7B: 7B-parameter model that uses bidirectional attention for embedding and causal attention for generation. It is finetuned from Mistral-7B. Embedding performance: 66.8; generation performance: 55.5.
GritLM-8x7B: 8x7B-parameter model that uses ...
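
The two attention modes named in the GritLM-7B description differ in which positions each token is allowed to see. The following is a minimal sketch in plain Python, not GritLM's actual code: the helper name `attention_mask` and its parameters are hypothetical, chosen only to illustrate the idea. A 1 at row i, column j means that token i may attend to token j.

```python
def attention_mask(seq_len, causal):
    """Build a seq_len x seq_len 0/1 mask (hypothetical helper, for illustration only).

    causal=True  -> token i sees only tokens 0..i (used for generation)
    causal=False -> token i sees every token (bidirectional, used for embedding)
    """
    return [
        [1 if (not causal or j <= i) else 0 for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Bidirectional mask: every row is all ones.
bidirectional = attention_mask(4, causal=False)

# Causal mask: lower-triangular, so no token sees a later position.
causal = attention_mask(4, causal=True)

for row in causal:
    print(row)
```

In the causal case the first token attends only to itself, while in the bidirectional case it attends to the whole sequence, which is why the latter suits producing a single embedding for an input.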