LLM Resources
LLM Quantization

# Load OPT-350M with its weights quantized to 4-bit (requires the bitsandbytes package)
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m", load_in_4bit=True, device_map="auto")

Intro to 8-bit matrix multiplication for transformers
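
As a companion to that resource, here is a minimal sketch of loading the same checkpoint with 8-bit weights instead of 4-bit, using the BitsAndBytesConfig path in recent transformers releases; the model name, prompt, and generation settings are just illustrative assumptions.

# Minimal sketch: 8-bit (LLM.int8()) loading via BitsAndBytesConfig
# (assumes transformers and bitsandbytes are installed and a GPU is available)
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(load_in_8bit=True)  # quantize weights to 8-bit

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    quantization_config=quant_config,
    device_map="auto",  # dispatch layers across available devices automatically
)

# Quick sanity check: generate a few tokens with the quantized model
inputs = tokenizer("Quantization reduces memory by", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))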
