Transformers documentation
Auto Classes
In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you
are supplying to the from_pretrained() method. AutoClasses are here to do this job for you so that you
automatically retrieve the relevant model given the name/path to the pretrained weights/config/vocabulary.
Instantiating one of AutoConfig, AutoModel, and AutoTokenizer will directly create a class of the relevant architecture. For instance
model = AutoModel.from_pretrained("google-bert/bert-base-cased")
will create a model that is an instance of BertModel.
There is one AutoModel class for each task.
Extending the Auto Classes
Each of the auto classes has a method that lets you extend it with your own custom classes. For instance, if you have defined a custom model class NewModel, make sure you also have a NewModelConfig; you can then add them to the auto classes like this:
from transformers import AutoConfig, AutoModel
AutoConfig.register("new-model", NewModelConfig)
AutoModel.register(NewModelConfig, NewModel)
You will then be able to use the auto classes like you usually would!
If your NewModelConfig is a subclass of PreTrainedConfig, make sure its model_type attribute is set to the same key you use when registering the config (here "new-model"). Likewise, if your NewModel is a subclass of PreTrainedModel, make sure its config_class attribute is set to the same class you use when registering the model (here NewModelConfig).
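Putting the registration steps above together, a minimal, self-contained sketch (the NewModelConfig/NewModel classes here are toy placeholders, not real library classes; assumes transformers and torch are installed):

```python
import torch.nn as nn
from transformers import AutoConfig, AutoModel, PretrainedConfig, PreTrainedModel

class NewModelConfig(PretrainedConfig):
    # model_type must match the key passed to AutoConfig.register below
    model_type = "new-model"

    def __init__(self, hidden_size=16, **kwargs):
        self.hidden_size = hidden_size
        super().__init__(**kwargs)

class NewModel(PreTrainedModel):
    # config_class must match the config class registered for this model
    config_class = NewModelConfig

    def __init__(self, config):
        super().__init__(config)
        self.linear = nn.Linear(config.hidden_size, config.hidden_size)

    def forward(self, x):
        return self.linear(x)

# Register the pair so the auto classes can resolve them
AutoConfig.register("new-model", NewModelConfig)
AutoModel.register(NewModelConfig, NewModel)

# The auto classes now resolve "new-model" to the custom classes
config = AutoConfig.for_model("new-model", hidden_size=32)
model = AutoModel.from_config(config)
print(type(model).__name__)
```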
AutoConfig
This is a generic configuration class that will be instantiated as one of the configuration classes of the library when created with the from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
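A quick sketch of that behavior, assuming transformers is installed (AutoConfig raises EnvironmentError when constructed directly):

```python
from transformers import AutoConfig

# AutoConfig is a factory: build instances via from_pretrained(), never directly.
try:
    AutoConfig()
except EnvironmentError as err:
    print(f"direct instantiation failed as expected: {err}")
```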
from_pretrained
< source >( pretrained_model_name_or_path: str | os.PathLike[str], **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model configuration hosted inside a model repo on huggingface.co.
  - A path to a directory containing a configuration file saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - A path to a saved configuration JSON file, e.g., ./my_model_directory/configuration.json.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns just the final configuration object. If True, this function returns a tuple (config, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the part of kwargs which has not been used to update config and is otherwise ignored.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (additional keyword arguments, optional) — The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not configuration attributes is controlled by the return_unused_kwargs keyword parameter.
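To illustrate how kwargs overriding and return_unused_kwargs interact without a network call, a minimal sketch that first saves a BertConfig locally (the key my_extra_flag is made up for illustration; it is not a real configuration attribute):

```python
from tempfile import TemporaryDirectory

from transformers import AutoConfig, BertConfig

with TemporaryDirectory() as tmp:
    # Save a config with a known model_type so AutoConfig can resolve it offline.
    BertConfig(num_hidden_layers=3).save_pretrained(tmp)

    config, unused = AutoConfig.from_pretrained(
        tmp,
        hidden_dropout_prob=0.2,  # a real BertConfig attribute: overrides the loaded value
        my_extra_flag=True,       # hypothetical key, not a config attribute: left unused
        return_unused_kwargs=True,
    )

print(type(config).__name__)       # the resolved class, selected from model_type
print(config.hidden_dropout_prob)  # the overridden value
print(unused)                      # contains only the non-attribute key/value pairs
```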
Instantiate one of the configuration classes of the library from a pretrained model configuration.
The configuration class to instantiate is selected based on the model_type property of the config object that
is loaded, or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- afmoe — AfmoeConfig (AfmoeConfig model)
- aimv2 — Aimv2Config (Aimv2Config model)
- aimv2_text_model — Aimv2TextConfig (Aimv2TextConfig model)
- aimv2_vision_model — Aimv2VisionConfig (Aimv2VisionConfig model)
- albert — AlbertConfig (AlbertConfig model)
- align — AlignConfig (AlignConfig model)
- align_text_model — AlignTextConfig (AlignTextConfig model)
- align_vision_model — AlignVisionConfig (AlignVisionConfig model)
- altclip — AltCLIPConfig (AltCLIPConfig model)
- altclip_text_model — AltCLIPTextConfig (AltCLIPTextConfig model)
- altclip_vision_model — AltCLIPVisionConfig (AltCLIPVisionConfig model)
- apertus — ApertusConfig (ApertusConfig model)
- arcee — ArceeConfig (ArceeConfig model)
- aria — AriaConfig (AriaConfig model)
- aria_text — AriaTextConfig (AriaTextConfig model)
- audio-spectrogram-transformer — ASTConfig (ASTConfig model)
- audioflamingo3 — AudioFlamingo3Config (AudioFlamingo3Config model)
- audioflamingo3_encoder — AudioFlamingo3EncoderConfig (AudioFlamingo3EncoderConfig model)
- autoformer — AutoformerConfig (AutoformerConfig model)
- aya_vision — AyaVisionConfig (AyaVisionConfig model)
- bamba — BambaConfig (BambaConfig model)
- bark — BarkConfig (BarkConfig model)
- bart — BartConfig (BartConfig model)
- beit — BeitConfig (BeitConfig model)
- bert — BertConfig (BertConfig model)
- bert-generation — BertGenerationConfig (BertGenerationConfig model)
- big_bird — BigBirdConfig (BigBirdConfig model)
- bigbird_pegasus — BigBirdPegasusConfig (BigBirdPegasusConfig model)
- biogpt — BioGptConfig (BioGptConfig model)
- bit — BitConfig (BitConfig model)
- bitnet — BitNetConfig (BitNetConfig model)
- blenderbot — BlenderbotConfig (BlenderbotConfig model)
- blenderbot-small — BlenderbotSmallConfig (BlenderbotSmallConfig model)
- blip — BlipConfig (BlipConfig model)
- blip-2 — Blip2Config (Blip2Config model)
- blip_2_qformer — Blip2QFormerConfig (Blip2QFormerConfig model)
- blip_2_vision_model — Blip2VisionConfig (Blip2VisionConfig model)
- blip_text_model — BlipTextConfig (BlipTextConfig model)
- blip_vision_model — BlipVisionConfig (BlipVisionConfig model)
- bloom — BloomConfig (BloomConfig model)
- blt — BltConfig (BltConfig model)
- blt_global_transformer — BltGlobalTransformerConfig (BltGlobalTransformerConfig model)
- blt_local_decoder — BltLocalDecoderConfig (BltLocalDecoderConfig model)
- blt_local_encoder — BltLocalEncoderConfig (BltLocalEncoderConfig model)
- blt_patcher — BltPatcherConfig (BltPatcherConfig model)
- bridgetower — BridgeTowerConfig (BridgeTowerConfig model)
- bridgetower_text_model — BridgeTowerTextConfig (BridgeTowerTextConfig model)
- bridgetower_vision_model — BridgeTowerVisionConfig (BridgeTowerVisionConfig model)
- bros — BrosConfig (BrosConfig model)
- camembert — CamembertConfig (CamembertConfig model)
- canine — CanineConfig (CanineConfig model)
- chameleon — ChameleonConfig (ChameleonConfig model)
- chameleon_vqgan — ChameleonVQVAEConfig (ChameleonVQVAEConfig model)
- chinese_clip — ChineseCLIPConfig (ChineseCLIPConfig model)
- chinese_clip_text_model — ChineseCLIPTextConfig (ChineseCLIPTextConfig model)
- chinese_clip_vision_model — ChineseCLIPVisionConfig (ChineseCLIPVisionConfig model)
- chmv2 — CHMv2Config (CHMv2Config model)
- clap — ClapConfig (ClapConfig model)
- clap_audio_model — ClapAudioConfig (ClapAudioConfig model)
- clap_text_model — ClapTextConfig (ClapTextConfig model)
- clip — CLIPConfig (CLIPConfig model)
- clip_text_model — CLIPTextConfig (CLIPTextConfig model)
- clip_vision_model — CLIPVisionConfig (CLIPVisionConfig model)
- clipseg — CLIPSegConfig (CLIPSegConfig model)
- clipseg_text_model — CLIPSegTextConfig (CLIPSegTextConfig model)
- clipseg_vision_model — CLIPSegVisionConfig (CLIPSegVisionConfig model)
- clvp — ClvpConfig (ClvpConfig model)
- clvp_decoder — ClvpDecoderConfig (ClvpDecoderConfig model)
- clvp_encoder — ClvpEncoderConfig (ClvpEncoderConfig model)
- codegen — CodeGenConfig (CodeGenConfig model)
- cohere — CohereConfig (CohereConfig model)
- cohere2 — Cohere2Config (Cohere2Config model)
- cohere2_vision — Cohere2VisionConfig (Cohere2VisionConfig model)
- cohere_asr — CohereAsrConfig (CohereAsrConfig model)
- colmodernvbert — ColModernVBertConfig (ColModernVBertConfig model)
- colpali — ColPaliConfig (ColPaliConfig model)
- colqwen2 — ColQwen2Config (ColQwen2Config model)
- conditional_detr — ConditionalDetrConfig (ConditionalDetrConfig model)
- convbert — ConvBertConfig (ConvBertConfig model)
- convnext — ConvNextConfig (ConvNextConfig model)
- convnextv2 — ConvNextV2Config (ConvNextV2Config model)
- cpmant — CpmAntConfig (CpmAntConfig model)
- csm — CsmConfig (CsmConfig model)
- csm_depth_decoder_model — CsmDepthDecoderConfig (CsmDepthDecoderConfig model)
- ctrl — CTRLConfig (CTRLConfig model)
- cvt — CvtConfig (CvtConfig model)
- cwm — CwmConfig (CwmConfig model)
- d_fine — DFineConfig (DFineConfig model)
- dab-detr — DabDetrConfig (DabDetrConfig model)
- dac — DacConfig (DacConfig model)
- data2vec-audio — Data2VecAudioConfig (Data2VecAudioConfig model)
- data2vec-text — Data2VecTextConfig (Data2VecTextConfig model)
- data2vec-vision — Data2VecVisionConfig (Data2VecVisionConfig model)
- dbrx — DbrxConfig (DbrxConfig model)
- deberta — DebertaConfig (DebertaConfig model)
- deberta-v2 — DebertaV2Config (DebertaV2Config model)
- decision_transformer — DecisionTransformerConfig (DecisionTransformerConfig model)
- deepseek_v2 — DeepseekV2Config (DeepseekV2Config model)
- deepseek_v3 — DeepseekV3Config (DeepseekV3Config model)
- deepseek_vl — DeepseekVLConfig (DeepseekVLConfig model)
- deepseek_vl_hybrid — DeepseekVLHybridConfig (DeepseekVLHybridConfig model)
- deformable_detr — DeformableDetrConfig (DeformableDetrConfig model)
- deit — DeiTConfig (DeiTConfig model)
- depth_anything — DepthAnythingConfig (DepthAnythingConfig model)
- depth_pro — DepthProConfig (DepthProConfig model)
- detr — DetrConfig (DetrConfig model)
- dia — DiaConfig (DiaConfig model)
- dia_decoder — DiaDecoderConfig (DiaDecoderConfig model)
- dia_encoder — DiaEncoderConfig (DiaEncoderConfig model)
- diffllama — DiffLlamaConfig (DiffLlamaConfig model)
- dinat — DinatConfig (DinatConfig model)
- dinov2 — Dinov2Config (Dinov2Config model)
- dinov2_with_registers — Dinov2WithRegistersConfig (Dinov2WithRegistersConfig model)
- dinov3_convnext — DINOv3ConvNextConfig (DINOv3ConvNextConfig model)
- dinov3_vit — DINOv3ViTConfig (DINOv3ViTConfig model)
- distilbert — DistilBertConfig (DistilBertConfig model)
- doge — DogeConfig (DogeConfig model)
- donut-swin — DonutSwinConfig (DonutSwinConfig model)
- dots1 — Dots1Config (Dots1Config model)
- dpr — DPRConfig (DPRConfig model)
- dpt — DPTConfig (DPTConfig model)
- edgetam — EdgeTamConfig (EdgeTamConfig model)
- edgetam_video — EdgeTamVideoConfig (EdgeTamVideoConfig model)
- edgetam_vision_model — EdgeTamVisionConfig (EdgeTamVisionConfig model)
- efficientloftr — EfficientLoFTRConfig (EfficientLoFTRConfig model)
- efficientnet — EfficientNetConfig (EfficientNetConfig model)
- electra — ElectraConfig (ElectraConfig model)
- emu3 — Emu3Config (Emu3Config model)
- emu3_text_model — Emu3TextConfig (Emu3TextConfig model)
- emu3_vqgan — Emu3VQVAEConfig (Emu3VQVAEConfig model)
- encodec — EncodecConfig (EncodecConfig model)
- encoder-decoder — EncoderDecoderConfig (EncoderDecoderConfig model)
- eomt — EomtConfig (EomtConfig model)
- eomt_dinov3 — EomtDinov3Config (EomtDinov3Config model)
- ernie — ErnieConfig (ErnieConfig model)
- ernie4_5 — Ernie4_5Config (Ernie4_5Config model)
- ernie4_5_moe — Ernie4_5_MoeConfig (Ernie4_5_MoeConfig model)
- ernie4_5_vl_moe — Ernie4_5_VLMoeConfig (Ernie4_5_VLMoeConfig model)
- ernie4_5_vl_moe_text — Ernie4_5_VLMoeTextConfig (Ernie4_5_VLMoeTextConfig model)
- ernie4_5_vl_moe_vision — Ernie4_5_VLMoeVisionConfig (Ernie4_5_VLMoeVisionConfig model)
- esm — EsmConfig (EsmConfig model)
- eurobert — EuroBertConfig (EuroBertConfig model)
- evolla — EvollaConfig (EvollaConfig model)
- exaone4 — Exaone4Config (Exaone4Config model)
- exaone_moe — ExaoneMoeConfig (ExaoneMoeConfig model)
- falcon — FalconConfig (FalconConfig model)
- falcon_h1 — FalconH1Config (FalconH1Config model)
- falcon_mamba — FalconMambaConfig (FalconMambaConfig model)
- fast_vlm — FastVlmConfig (FastVlmConfig model)
- fastspeech2_conformer — FastSpeech2ConformerConfig (FastSpeech2ConformerConfig model)
- fastspeech2_conformer_hifigan — FastSpeech2ConformerHifiGanConfig (FastSpeech2ConformerHifiGanConfig model)
- fastspeech2_conformer_with_hifigan — FastSpeech2ConformerWithHifiGanConfig (FastSpeech2ConformerWithHifiGanConfig model)
- flaubert — FlaubertConfig (FlaubertConfig model)
- flava — FlavaConfig (FlavaConfig model)
- flava_image_model — FlavaImageConfig (FlavaImageConfig model)
- flava_multimodal_model — FlavaMultimodalConfig (FlavaMultimodalConfig model)
- flava_text_model — FlavaTextConfig (FlavaTextConfig model)
- flex_olmo — FlexOlmoConfig (FlexOlmoConfig model)
- florence2 — Florence2Config (Florence2Config model)
- florence_vision — Florence2VisionConfig (Florence2VisionConfig model)
- fnet — FNetConfig (FNetConfig model)
- focalnet — FocalNetConfig (FocalNetConfig model)
- fsmt — FSMTConfig (FSMTConfig model)
- funnel — FunnelConfig (FunnelConfig model)
- fuyu — FuyuConfig (FuyuConfig model)
- gemma — GemmaConfig (GemmaConfig model)
- gemma2 — Gemma2Config (Gemma2Config model)
- gemma3 — Gemma3Config (Gemma3Config model)
- gemma3_text — Gemma3TextConfig (Gemma3TextConfig model)
- gemma3n — Gemma3nConfig (Gemma3nConfig model)
- gemma3n_audio — Gemma3nAudioConfig (Gemma3nAudioConfig model)
- gemma3n_text — Gemma3nTextConfig (Gemma3nTextConfig model)
- gemma3n_vision — Gemma3nVisionConfig (Gemma3nVisionConfig model)
- gemma4 — Gemma4Config (Gemma4Config model)
- gemma4_audio — Gemma4AudioConfig (Gemma4AudioConfig model)
- gemma4_text — Gemma4TextConfig (Gemma4TextConfig model)
- gemma4_vision — Gemma4VisionConfig (Gemma4VisionConfig model)
- git — GitConfig (GitConfig model)
- git_vision_model — GitVisionConfig (GitVisionConfig model)
- glm — GlmConfig (GlmConfig model)
- glm4 — Glm4Config (Glm4Config model)
- glm46v — Glm46VConfig (Glm46VConfig model)
- glm4_moe — Glm4MoeConfig (Glm4MoeConfig model)
- glm4_moe_lite — Glm4MoeLiteConfig (Glm4MoeLiteConfig model)
- glm4v — Glm4vConfig (Glm4vConfig model)
- glm4v_moe — Glm4vMoeConfig (Glm4vMoeConfig model)
- glm4v_moe_text — Glm4vMoeTextConfig (Glm4vMoeTextConfig model)
- glm4v_moe_vision — Glm4vMoeVisionConfig (Glm4vMoeVisionConfig model)
- glm4v_text — Glm4vTextConfig (Glm4vTextConfig model)
- glm4v_vision — Glm4vVisionConfig (Glm4vVisionConfig model)
- glm_image — GlmImageConfig (GlmImageConfig model)
- glm_image_text — GlmImageTextConfig (GlmImageTextConfig model)
- glm_image_vision — GlmImageVisionConfig (GlmImageVisionConfig model)
- glm_image_vqmodel — GlmImageVQVAEConfig (GlmImageVQVAEConfig model)
- glm_moe_dsa — GlmMoeDsaConfig (GlmMoeDsaConfig model)
- glm_ocr — GlmOcrConfig (GlmOcrConfig model)
- glm_ocr_text — GlmOcrTextConfig (GlmOcrTextConfig model)
- glm_ocr_vision — GlmOcrVisionConfig (GlmOcrVisionConfig model)
- glmasr — GlmAsrConfig (GlmAsrConfig model)
- glmasr_encoder — GlmAsrEncoderConfig (GlmAsrEncoderConfig model)
- glpn — GLPNConfig (GLPNConfig model)
- got_ocr2 — GotOcr2Config (GotOcr2Config model)
- gpt-sw3 — GPT2Config (GPT2Config model)
- gpt2 — GPT2Config (GPT2Config model)
- gpt_bigcode — GPTBigCodeConfig (GPTBigCodeConfig model)
- gpt_neo — GPTNeoConfig (GPTNeoConfig model)
- gpt_neox — GPTNeoXConfig (GPTNeoXConfig model)
- gpt_neox_japanese — GPTNeoXJapaneseConfig (GPTNeoXJapaneseConfig model)
- gpt_oss — GptOssConfig (GptOssConfig model)
- gptj — GPTJConfig (GPTJConfig model)
- granite — GraniteConfig (GraniteConfig model)
- granite_speech — GraniteSpeechConfig (GraniteSpeechConfig model)
- granite_speech_encoder — GraniteSpeechEncoderConfig (GraniteSpeechEncoderConfig model)
- granitemoe — GraniteMoeConfig (GraniteMoeConfig model)
- granitemoehybrid — GraniteMoeHybridConfig (GraniteMoeHybridConfig model)
- granitemoeshared — GraniteMoeSharedConfig (GraniteMoeSharedConfig model)
- grounding-dino — GroundingDinoConfig (GroundingDinoConfig model)
- groupvit — GroupViTConfig (GroupViTConfig model)
- groupvit_text_model — GroupViTTextConfig (GroupViTTextConfig model)
- groupvit_vision_model — GroupViTVisionConfig (GroupViTVisionConfig model)
- helium — HeliumConfig (HeliumConfig model)
- hgnet_v2 — HGNetV2Config (HGNetV2Config model)
- hiera — HieraConfig (HieraConfig model)
- higgs_audio_v2 — HiggsAudioV2Config (HiggsAudioV2Config model)
- higgs_audio_v2_tokenizer — HiggsAudioV2TokenizerConfig (HiggsAudioV2TokenizerConfig model)
- hubert — HubertConfig (HubertConfig model)
- hunyuan_v1_dense — HunYuanDenseV1Config (HunYuanDenseV1Config model)
- hunyuan_v1_moe — HunYuanMoEV1Config (HunYuanMoEV1Config model)
- ibert — IBertConfig (IBertConfig model)
- idefics — IdeficsConfig (IdeficsConfig model)
- idefics2 — Idefics2Config (Idefics2Config model)
- idefics2_perceiver — Idefics2PerceiverConfig (Idefics2PerceiverConfig model)
- idefics2_vision — Idefics2VisionConfig (Idefics2VisionConfig model)
- idefics3 — Idefics3Config (Idefics3Config model)
- idefics3_vision — Idefics3VisionConfig (Idefics3VisionConfig model)
- idefics_perciever — IdeficsPerceiverConfig (IdeficsPerceiverConfig model)
- idefics_vision — IdeficsVisionConfig (IdeficsVisionConfig model)
- ijepa — IJepaConfig (IJepaConfig model)
- imagegpt — ImageGPTConfig (ImageGPTConfig model)
- informer — InformerConfig (InformerConfig model)
- instructblip — InstructBlipConfig (InstructBlipConfig model)
- instructblip_qformer — InstructBlipQFormerConfig (InstructBlipQFormerConfig model)
- instructblip_vision_model — InstructBlipVisionConfig (InstructBlipVisionConfig model)
- instructblipvideo — InstructBlipVideoConfig (InstructBlipVideoConfig model)
- instructblipvideo_qformer — InstructBlipVideoQFormerConfig (InstructBlipVideoQFormerConfig model)
- instructblipvideo_vision_model — InstructBlipVideoVisionConfig (InstructBlipVideoVisionConfig model)
- internvl — InternVLConfig (InternVLConfig model)
- internvl_vision — InternVLVisionConfig (InternVLVisionConfig model)
- jais2 — Jais2Config (Jais2Config model)
- jamba — JambaConfig (JambaConfig model)
- janus — JanusConfig (JanusConfig model)
- janus_vision_model — JanusVisionConfig (JanusVisionConfig model)
- janus_vqgan — JanusVQVAEConfig (JanusVQVAEConfig model)
- jetmoe — JetMoeConfig (JetMoeConfig model)
- jina_embeddings_v3 — JinaEmbeddingsV3Config (JinaEmbeddingsV3Config model)
- kosmos-2 — Kosmos2Config (Kosmos2Config model)
- kosmos-2.5 — Kosmos2_5Config (Kosmos2_5Config model)
- kosmos_2_5_text_model — Kosmos2_5TextConfig (Kosmos2_5TextConfig model)
- kosmos_2_5_vision_model — Kosmos2_5VisionConfig (Kosmos2_5VisionConfig model)
- kosmos_2_text_model — Kosmos2TextConfig (Kosmos2TextConfig model)
- kosmos_2_vision_model — Kosmos2VisionConfig (Kosmos2VisionConfig model)
- kyutai_speech_to_text — KyutaiSpeechToTextConfig (KyutaiSpeechToTextConfig model)
- lasr_ctc — LasrCTCConfig (LasrCTCConfig model)
- lasr_encoder — LasrEncoderConfig (LasrEncoderConfig model)
- layoutlm — LayoutLMConfig (LayoutLMConfig model)
- layoutlmv2 — LayoutLMv2Config (LayoutLMv2Config model)
- layoutlmv3 — LayoutLMv3Config (LayoutLMv3Config model)
- layoutxlm — LayoutXLMConfig (LayoutXLMConfig model)
- led — LEDConfig (LEDConfig model)
- levit — LevitConfig (LevitConfig model)
- lfm2 — Lfm2Config (Lfm2Config model)
- lfm2_moe — Lfm2MoeConfig (Lfm2MoeConfig model)
- lfm2_vl — Lfm2VlConfig (Lfm2VlConfig model)
- lightglue — LightGlueConfig (LightGlueConfig model)
- lighton_ocr — LightOnOcrConfig (LightOnOcrConfig model)
- lilt — LiltConfig (LiltConfig model)
- llama — LlamaConfig (LlamaConfig model)
- llama4 — Llama4Config (Llama4Config model)
- llama4_text — Llama4TextConfig (Llama4TextConfig model)
- llama4_vision_model — Llama4VisionConfig (Llama4VisionConfig model)
- llava — LlavaConfig (LlavaConfig model)
- llava_next — LlavaNextConfig (LlavaNextConfig model)
- llava_next_video — LlavaNextVideoConfig (LlavaNextVideoConfig model)
- llava_onevision — LlavaOnevisionConfig (LlavaOnevisionConfig model)
- longcat_flash — LongcatFlashConfig (LongcatFlashConfig model)
- longformer — LongformerConfig (LongformerConfig model)
- longt5 — LongT5Config (LongT5Config model)
- luke — LukeConfig (LukeConfig model)
- lw_detr — LwDetrConfig (LwDetrConfig model)
- lw_detr_vit — LwDetrViTConfig (LwDetrViTConfig model)
- lxmert — LxmertConfig (LxmertConfig model)
- m2m_100 — M2M100Config (M2M100Config model)
- mamba — MambaConfig (MambaConfig model)
- mamba2 — Mamba2Config (Mamba2Config model)
- marian — MarianConfig (MarianConfig model)
- markuplm — MarkupLMConfig (MarkupLMConfig model)
- mask2former — Mask2FormerConfig (Mask2FormerConfig model)
- maskformer — MaskFormerConfig (MaskFormerConfig model)
- maskformer-swin — MaskFormerSwinConfig (MaskFormerSwinConfig model)
- mbart — MBartConfig (MBartConfig model)
- megatron-bert — MegatronBertConfig (MegatronBertConfig model)
- metaclip_2 — MetaClip2Config (MetaClip2Config model)
- metaclip_2_text_model — MetaClip2TextConfig (MetaClip2TextConfig model)
- metaclip_2_vision_model — MetaClip2VisionConfig (MetaClip2VisionConfig model)
- mgp-str — MgpstrConfig (MgpstrConfig model)
- mimi — MimiConfig (MimiConfig model)
- minimax — MiniMaxConfig (MiniMaxConfig model)
- minimax_m2 — MiniMaxM2Config (MiniMaxM2Config model)
- ministral — MinistralConfig (MinistralConfig model)
- ministral3 — Ministral3Config (Ministral3Config model)
- mistral — MistralConfig (MistralConfig model)
- mistral3 — Mistral3Config (Mistral3Config model)
- mistral4 — Mistral4Config (Mistral4Config model)
- mixtral — MixtralConfig (MixtralConfig model)
- mlcd — MLCDVisionConfig (MLCDVisionConfig model)
- mlcd_vision_model — MLCDVisionConfig (MLCDVisionConfig model)
- mllama — MllamaConfig (MllamaConfig model)
- mllama_text_model — MllamaTextConfig (MllamaTextConfig model)
- mllama_vision_model — MllamaVisionConfig (MllamaVisionConfig model)
- mm-grounding-dino — MMGroundingDinoConfig (MMGroundingDinoConfig model)
- mobilebert — MobileBertConfig (MobileBertConfig model)
- mobilenet_v1 — MobileNetV1Config (MobileNetV1Config model)
- mobilenet_v2 — MobileNetV2Config (MobileNetV2Config model)
- mobilevit — MobileViTConfig (MobileViTConfig model)
- mobilevitv2 — MobileViTV2Config (MobileViTV2Config model)
- modernbert — ModernBertConfig (ModernBertConfig model)
- modernbert-decoder — ModernBertDecoderConfig (ModernBertDecoderConfig model)
- modernvbert — ModernVBertConfig (ModernVBertConfig model)
- moonshine — MoonshineConfig (MoonshineConfig model)
- moonshine_streaming — MoonshineStreamingConfig (MoonshineStreamingConfig model)
- moonshine_streaming_encoder — MoonshineStreamingEncoderConfig (MoonshineStreamingEncoderConfig model)
- moshi — MoshiConfig (MoshiConfig model)
- moshi_depth — MoshiDepthConfig (MoshiDepthConfig model)
- mpnet — MPNetConfig (MPNetConfig model)
- mpt — MptConfig (MptConfig model)
- mra — MraConfig (MraConfig model)
- mt5 — MT5Config (MT5Config model)
- musicflamingo — MusicFlamingoConfig (MusicFlamingoConfig model)
- musicgen — MusicgenConfig (MusicgenConfig model)
- musicgen_decoder — MusicgenDecoderConfig (MusicgenDecoderConfig model)
- musicgen_melody — MusicgenMelodyConfig (MusicgenMelodyConfig model)
- musicgen_melody_decoder — MusicgenMelodyDecoderConfig (MusicgenMelodyDecoderConfig model)
- mvp — MvpConfig (MvpConfig model)
- nanochat — NanoChatConfig (NanoChatConfig model)
- nemotron — NemotronConfig (NemotronConfig model)
- nemotron_h — NemotronHConfig (NemotronHConfig model)
- nllb-moe — NllbMoeConfig (NllbMoeConfig model)
- nomic_bert — NomicBertConfig (NomicBertConfig model)
- nougat — NougatConfig (NougatConfig model)
- nystromformer — NystromformerConfig (NystromformerConfig model)
- olmo — OlmoConfig (OlmoConfig model)
- olmo2 — Olmo2Config (Olmo2Config model)
- olmo3 — Olmo3Config (Olmo3Config model)
- olmo_hybrid — OlmoHybridConfig (OlmoHybridConfig model)
- olmoe — OlmoeConfig (OlmoeConfig model)
- omdet-turbo — OmDetTurboConfig (OmDetTurboConfig model)
- oneformer — OneFormerConfig (OneFormerConfig model)
- openai-gpt — OpenAIGPTConfig (OpenAIGPTConfig model)
- opt — OPTConfig (OPTConfig model)
- ovis2 — Ovis2Config (Ovis2Config model)
- owlv2 — Owlv2Config (Owlv2Config model)
- owlv2_text_model — Owlv2TextConfig (Owlv2TextConfig model)
- owlv2_vision_model — Owlv2VisionConfig (Owlv2VisionConfig model)
- owlvit — OwlViTConfig (OwlViTConfig model)
- owlvit_text_model — OwlViTTextConfig (OwlViTTextConfig model)
- owlvit_vision_model — OwlViTVisionConfig (OwlViTVisionConfig model)
- paddleocr_vl — PaddleOCRVLConfig (PaddleOCRVLConfig model)
- paddleocr_vl_text — PaddleOCRTextConfig (PaddleOCRTextConfig model)
- paddleocr_vl_vision — PaddleOCRVisionConfig (PaddleOCRVisionConfig model)
- paligemma — PaliGemmaConfig (PaliGemmaConfig model)
- parakeet_ctc — ParakeetCTCConfig (ParakeetCTCConfig model)
- parakeet_encoder — ParakeetEncoderConfig (ParakeetEncoderConfig model)
- patchtsmixer — PatchTSMixerConfig (PatchTSMixerConfig model)
- patchtst — PatchTSTConfig (PatchTSTConfig model)
- pe_audio — PeAudioConfig (PeAudioConfig model)
- pe_audio_encoder — PeAudioEncoderConfig (PeAudioEncoderConfig model)
- pe_audio_video — PeAudioVideoConfig (PeAudioVideoConfig model)
- pe_audio_video_encoder — PeAudioVideoEncoderConfig (PeAudioVideoEncoderConfig model)
- pe_video — PeVideoConfig (PeVideoConfig model)
- pe_video_encoder — PeVideoEncoderConfig (PeVideoEncoderConfig model)
- pegasus — PegasusConfig (PegasusConfig model)
- pegasus_x — PegasusXConfig (PegasusXConfig model)
- perceiver — PerceiverConfig (PerceiverConfig model)
- perception_lm — PerceptionLMConfig (PerceptionLMConfig model)
- persimmon — PersimmonConfig (PersimmonConfig model)
- phi — PhiConfig (PhiConfig model)
- phi3 — Phi3Config (Phi3Config model)
- phi4_multimodal — Phi4MultimodalConfig (Phi4MultimodalConfig model)
- phi4_multimodal_audio — Phi4MultimodalAudioConfig (Phi4MultimodalAudioConfig model)
- phi4_multimodal_vision — Phi4MultimodalVisionConfig (Phi4MultimodalVisionConfig model)
- phimoe — PhimoeConfig (PhimoeConfig model)
- pi0 — PI0Config (PI0Config model)
- pix2struct — Pix2StructConfig (Pix2StructConfig model)
- pix2struct_text_model — Pix2StructTextConfig (Pix2StructTextConfig model)
- pix2struct_vision_model — Pix2StructVisionConfig (Pix2StructVisionConfig model)
- pixio — PixioConfig (PixioConfig model)
- pixtral — PixtralVisionConfig (PixtralVisionConfig model)
- plbart — PLBartConfig (PLBartConfig model)
- poolformer — PoolFormerConfig (PoolFormerConfig model)
- pop2piano — Pop2PianoConfig (Pop2PianoConfig model)
- pp_chart2table — PPChart2TableConfig (PPChart2TableConfig model)
- pp_doclayout_v2 — PPDocLayoutV2Config (PPDocLayoutV2Config model)
- pp_doclayout_v3 — PPDocLayoutV3Config (PPDocLayoutV3Config model)
- pp_lcnet — PPLCNetConfig (PPLCNetConfig model)
- pp_lcnet_v3 — PPLCNetV3Config (PPLCNetV3Config model)
- pp_ocrv5_mobile_det — PPOCRV5MobileDetConfig (PPOCRV5MobileDetConfig model)
- pp_ocrv5_mobile_rec — PPOCRV5MobileRecConfig (PPOCRV5MobileRecConfig model)
- pp_ocrv5_server_det — PPOCRV5ServerDetConfig (PPOCRV5ServerDetConfig model)
- pp_ocrv5_server_rec — PPOCRV5ServerRecConfig (PPOCRV5ServerRecConfig model)
- prompt_depth_anything — PromptDepthAnythingConfig (PromptDepthAnythingConfig model)
- prophetnet — ProphetNetConfig (ProphetNetConfig model)
- pvt — PvtConfig (PvtConfig model)
- pvt_v2 — PvtV2Config (PvtV2Config model)
- qwen2 — Qwen2Config (Qwen2Config model)
- qwen2_5_omni — Qwen2_5OmniConfig (Qwen2_5OmniConfig model)
- qwen2_5_omni_audio_encoder — Qwen2_5OmniAudioEncoderConfig (Qwen2_5OmniAudioEncoderConfig model)
- qwen2_5_omni_bigvgan — Qwen2_5OmniBigVGANConfig (Qwen2_5OmniBigVGANConfig model)
- qwen2_5_omni_dit — Qwen2_5OmniDiTConfig (Qwen2_5OmniDiTConfig model)
- qwen2_5_omni_talker — Qwen2_5OmniTalkerConfig (Qwen2_5OmniTalkerConfig model)
- qwen2_5_omni_text — Qwen2_5OmniTextConfig (Qwen2_5OmniTextConfig model)
- qwen2_5_omni_thinker — Qwen2_5OmniThinkerConfig (Qwen2_5OmniThinkerConfig model)
- qwen2_5_omni_token2wav — Qwen2_5OmniToken2WavConfig (Qwen2_5OmniToken2WavConfig model)
- qwen2_5_omni_vision_encoder — Qwen2_5OmniVisionEncoderConfig (Qwen2_5OmniVisionEncoderConfig model)
- qwen2_5_vl — Qwen2_5_VLConfig (Qwen2_5_VLConfig model)
- qwen2_5_vl_text — Qwen2_5_VLTextConfig (Qwen2_5_VLTextConfig model)
- qwen2_5_vl_vision — Qwen2_5_VLVisionConfig (Qwen2_5_VLVisionConfig model)
- qwen2_audio — Qwen2AudioConfig (Qwen2AudioConfig model)
- qwen2_audio_encoder — Qwen2AudioEncoderConfig (Qwen2AudioEncoderConfig model)
- qwen2_moe — Qwen2MoeConfig (Qwen2MoeConfig model)
- qwen2_vl — Qwen2VLConfig (Qwen2VLConfig model)
- qwen2_vl_text — Qwen2VLTextConfig (Qwen2VLTextConfig model)
- qwen2_vl_vision — Qwen2VLVisionConfig (Qwen2VLVisionConfig model)
- qwen3 — Qwen3Config (Qwen3Config model)
- qwen3_5 — Qwen3_5Config (Qwen3_5Config model)
- qwen3_5_moe — Qwen3_5MoeConfig (Qwen3_5MoeConfig model)
- qwen3_5_moe_text — Qwen3_5MoeTextConfig (Qwen3_5MoeTextConfig model)
- qwen3_5_moe_vision — Qwen3_5MoeVisionConfig (Qwen3_5MoeVisionConfig model)
- qwen3_5_text — Qwen3_5TextConfig (Qwen3_5TextConfig model)
- qwen3_5_vision — Qwen3_5VisionConfig (Qwen3_5VisionConfig model)
- qwen3_moe — Qwen3MoeConfig (Qwen3MoeConfig model)
- qwen3_next — Qwen3NextConfig (Qwen3NextConfig model)
- qwen3_omni_moe — Qwen3OmniMoeConfig (Qwen3OmniMoeConfig model)
- qwen3_omni_moe_audio_encoder — Qwen3OmniMoeAudioEncoderConfig (Qwen3OmniMoeAudioEncoderConfig model)
- qwen3_omni_moe_talker_code_predictor — Qwen3OmniMoeTalkerCodePredictorConfig (Qwen3OmniMoeTalkerCodePredictorConfig model)
- qwen3_omni_moe_talker_text — Qwen3OmniMoeTalkerTextConfig (Qwen3OmniMoeTalkerTextConfig model)
- qwen3_omni_moe_text — Qwen3OmniMoeTextConfig (Qwen3OmniMoeTextConfig model)
- qwen3_omni_moe_thinker — Qwen3OmniMoeThinkerConfig (Qwen3OmniMoeThinkerConfig model)
- qwen3_omni_moe_vision_encoder — Qwen3OmniMoeVisionEncoderConfig (Qwen3OmniMoeVisionEncoderConfig model)
- qwen3_vl — Qwen3VLConfig (Qwen3VLConfig model)
- qwen3_vl_moe — Qwen3VLMoeConfig (Qwen3VLMoeConfig model)
- qwen3_vl_moe_text — Qwen3VLMoeTextConfig (Qwen3VLMoeTextConfig model)
- qwen3_vl_moe_vision — Qwen3VLMoeVisionConfig (Qwen3VLMoeVisionConfig model)
- qwen3_vl_text — Qwen3VLTextConfig (Qwen3VLTextConfig model)
- qwen3_vl_vision — Qwen3VLVisionConfig (Qwen3VLVisionConfig model)
- rag — RagConfig (RagConfig model)
- recurrent_gemma — RecurrentGemmaConfig (RecurrentGemmaConfig model)
- reformer — ReformerConfig (ReformerConfig model)
- regnet — RegNetConfig (RegNetConfig model)
- rembert — RemBertConfig (RemBertConfig model)
- resnet — ResNetConfig (ResNetConfig model)
- roberta — RobertaConfig (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormConfig (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertConfig (RoCBertConfig model)
- roformer — RoFormerConfig (RoFormerConfig model)
- rt_detr — RTDetrConfig (RTDetrConfig model)
- rt_detr_resnet — RTDetrResNetConfig (RTDetrResNetConfig model)
- rt_detr_v2 — RTDetrV2Config (RTDetrV2Config model)
- rwkv — RwkvConfig (RwkvConfig model)
- sam — SamConfig (SamConfig model)
- sam2 — Sam2Config (Sam2Config model)
- sam2_hiera_det_model — Sam2HieraDetConfig (Sam2HieraDetConfig model)
- sam2_video — Sam2VideoConfig (Sam2VideoConfig model)
- sam2_vision_model — Sam2VisionConfig (Sam2VisionConfig model)
- sam3 — Sam3Config (Sam3Config model)
- sam3_detr_decoder — Sam3DETRDecoderConfig (Sam3DETRDecoderConfig model)
- sam3_detr_encoder — Sam3DETREncoderConfig (Sam3DETREncoderConfig model)
- sam3_geometry_encoder — Sam3GeometryEncoderConfig (Sam3GeometryEncoderConfig model)
- sam3_lite_text — Sam3LiteTextConfig (Sam3LiteTextConfig model)
- sam3_lite_text_detr_decoder — Sam3LiteTextDETRDecoderConfig (Sam3LiteTextDETRDecoderConfig model)
- sam3_lite_text_detr_encoder — Sam3LiteTextDETREncoderConfig (Sam3LiteTextDETREncoderConfig model)
- sam3_lite_text_geometry_encoder — Sam3LiteTextGeometryEncoderConfig (Sam3LiteTextGeometryEncoderConfig model)
- sam3_lite_text_mask_decoder — Sam3LiteTextMaskDecoderConfig (Sam3LiteTextMaskDecoderConfig model)
- sam3_lite_text_text_model — Sam3LiteTextTextConfig (Sam3LiteTextTextConfig model)
- sam3_mask_decoder — Sam3MaskDecoderConfig (Sam3MaskDecoderConfig model)
- sam3_tracker — Sam3TrackerConfig (Sam3TrackerConfig model)
- sam3_tracker_video — Sam3TrackerVideoConfig (Sam3TrackerVideoConfig model)
- sam3_video — Sam3VideoConfig (Sam3VideoConfig model)
- sam3_vision_model — Sam3VisionConfig (Sam3VisionConfig model)
- sam3_vit_model — Sam3ViTConfig (Sam3ViTConfig model)
- sam_hq — SamHQConfig (SamHQConfig model)
- sam_hq_vision_model — SamHQVisionConfig (SamHQVisionConfig model)
- sam_vision_model — SamVisionConfig (SamVisionConfig model)
- seamless_m4t — SeamlessM4TConfig (SeamlessM4TConfig model)
- seamless_m4t_v2 — SeamlessM4Tv2Config (SeamlessM4Tv2Config model)
- seed_oss — SeedOssConfig (SeedOssConfig model)
- segformer — SegformerConfig (SegformerConfig model)
- seggpt — SegGptConfig (SegGptConfig model)
- sew — SEWConfig (SEWConfig model)
- sew-d — SEWDConfig (SEWDConfig model)
- shieldgemma2 — ShieldGemma2Config (ShieldGemma2Config model)
- siglip — SiglipConfig (SiglipConfig model)
- siglip2 — Siglip2Config (Siglip2Config model)
- siglip2_text_model — Siglip2TextConfig (Siglip2TextConfig model)
- siglip2_vision_model — Siglip2VisionConfig (Siglip2VisionConfig model)
- siglip_text_model — SiglipTextConfig (SiglipTextConfig model)
- siglip_vision_model — SiglipVisionConfig (SiglipVisionConfig model)
- slanext — SLANeXtConfig (SLANeXtConfig model)
- smollm3 — SmolLM3Config (SmolLM3Config model)
- smolvlm — SmolVLMConfig (SmolVLMConfig model)
- smolvlm_vision — SmolVLMVisionConfig (SmolVLMVisionConfig model)
- solar_open — SolarOpenConfig (SolarOpenConfig model)
- speech-encoder-decoder — SpeechEncoderDecoderConfig (SpeechEncoderDecoderConfig model)
- speech_to_text — Speech2TextConfig (Speech2TextConfig model)
- speecht5 — SpeechT5Config (SpeechT5Config model)
- speecht5_hifigan — SpeechT5HifiGanConfig (SpeechT5HifiGanConfig model)
- splinter — SplinterConfig (SplinterConfig model)
- squeezebert — SqueezeBertConfig (SqueezeBertConfig model)
- stablelm — StableLmConfig (StableLmConfig model)
- starcoder2 — Starcoder2Config (Starcoder2Config model)
- superglue — SuperGlueConfig (SuperGlueConfig model)
- superpoint — SuperPointConfig (SuperPointConfig model)
- swiftformer — SwiftFormerConfig (SwiftFormerConfig model)
- swin — SwinConfig (SwinConfig model)
- swin2sr — Swin2SRConfig (Swin2SRConfig model)
- swinv2 — Swinv2Config (Swinv2Config model)
- switch_transformers — SwitchTransformersConfig (SwitchTransformersConfig model)
- t5 — T5Config (T5Config model)
- t5_gemma_module — T5GemmaModuleConfig (T5GemmaModuleConfig model)
- t5gemma — T5GemmaConfig (T5GemmaConfig model)
- t5gemma2 — T5Gemma2Config (T5Gemma2Config model)
- t5gemma2_decoder — T5Gemma2DecoderConfig (T5Gemma2DecoderConfig model)
- t5gemma2_encoder — T5Gemma2EncoderConfig (T5Gemma2EncoderConfig model)
- t5gemma2_text — T5Gemma2TextConfig (T5Gemma2TextConfig model)
- table-transformer — TableTransformerConfig (TableTransformerConfig model)
- tapas — TapasConfig (TapasConfig model)
- textnet — TextNetConfig (TextNetConfig model)
- time_series_transformer — TimeSeriesTransformerConfig (TimeSeriesTransformerConfig model)
- timesfm — TimesFmConfig (TimesFmConfig model)
- timesfm2_5 — TimesFm2_5Config (TimesFm2_5Config model)
- timesformer — TimesformerConfig (TimesformerConfig model)
- timm_backbone — TimmBackboneConfig (TimmBackboneConfig model)
- timm_wrapper — TimmWrapperConfig (TimmWrapperConfig model)
- trocr — TrOCRConfig (TrOCRConfig model)
- tvp — TvpConfig (TvpConfig model)
- udop — UdopConfig (UdopConfig model)
- umt5 — UMT5Config (UMT5Config model)
- unispeech — UniSpeechConfig (UniSpeechConfig model)
- unispeech-sat — UniSpeechSatConfig (UniSpeechSatConfig model)
- univnet — UnivNetConfig (UnivNetConfig model)
- upernet — UperNetConfig (UperNetConfig model)
- uvdoc — UVDocConfig (UVDocConfig model)
- uvdoc_backbone — UVDocBackboneConfig (UVDocBackboneConfig model)
- vaultgemma — VaultGemmaConfig (VaultGemmaConfig model)
- vibevoice_acoustic_tokenizer — VibeVoiceAcousticTokenizerConfig (VibeVoiceAcousticTokenizerConfig model)
- vibevoice_acoustic_tokenizer_decoder — VibeVoiceAcousticTokenizerDecoderConfig (VibeVoiceAcousticTokenizerDecoderConfig model)
- vibevoice_acoustic_tokenizer_encoder — VibeVoiceAcousticTokenizerEncoderConfig (VibeVoiceAcousticTokenizerEncoderConfig model)
- vibevoice_asr — VibeVoiceAsrConfig (VibeVoiceAsrConfig model)
- video_llama_3 — VideoLlama3Config (VideoLlama3Config model)
- video_llama_3_vision — VideoLlama3VisionConfig (VideoLlama3VisionConfig model)
- video_llava — VideoLlavaConfig (VideoLlavaConfig model)
- videomae — VideoMAEConfig (VideoMAEConfig model)
- videomt — VideomtConfig (VideomtConfig model)
- vilt — ViltConfig (ViltConfig model)
- vipllava — VipLlavaConfig (VipLlavaConfig model)
- vision-encoder-decoder — VisionEncoderDecoderConfig (VisionEncoderDecoderConfig model)
- vision-text-dual-encoder — VisionTextDualEncoderConfig (VisionTextDualEncoderConfig model)
- visual_bert — VisualBertConfig (VisualBertConfig model)
- vit — ViTConfig (ViTConfig model)
- vit_mae — ViTMAEConfig (ViTMAEConfig model)
- vit_msn — ViTMSNConfig (ViTMSNConfig model)
- vitdet — VitDetConfig (VitDetConfig model)
- vitmatte — VitMatteConfig (VitMatteConfig model)
- vitpose — VitPoseConfig (VitPoseConfig model)
- vitpose_backbone — VitPoseBackboneConfig (VitPoseBackboneConfig model)
- vits — VitsConfig (VitsConfig model)
- vivit — VivitConfig (VivitConfig model)
- vjepa2 — VJEPA2Config (VJEPA2Config model)
- voxtral — VoxtralConfig (VoxtralConfig model)
- voxtral_encoder — VoxtralEncoderConfig (VoxtralEncoderConfig model)
- voxtral_realtime — VoxtralRealtimeConfig (VoxtralRealtimeConfig model)
- voxtral_realtime_encoder — VoxtralRealtimeEncoderConfig (VoxtralRealtimeEncoderConfig model)
- voxtral_realtime_text — VoxtralRealtimeTextConfig (VoxtralRealtimeTextConfig model)
- wav2vec2 — Wav2Vec2Config (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2BertConfig (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2ConformerConfig (Wav2Vec2ConformerConfig model)
- wavlm — WavLMConfig (WavLMConfig model)
- whisper — WhisperConfig (WhisperConfig model)
- xclip — XCLIPConfig (XCLIPConfig model)
- xclip_text_model — XCLIPTextConfig (XCLIPTextConfig model)
- xclip_vision_model — XCLIPVisionConfig (XCLIPVisionConfig model)
- xcodec — XcodecConfig (XcodecConfig model)
- xglm — XGLMConfig (XGLMConfig model)
- xlm — XLMConfig (XLMConfig model)
- xlm-roberta — XLMRobertaConfig (XLMRobertaConfig model)
- xlm-roberta-xl — XLMRobertaXLConfig (XLMRobertaXLConfig model)
- xlnet — XLNetConfig (XLNetConfig model)
- xlstm — xLSTMConfig (xLSTMConfig model)
- xmod — XmodConfig (XmodConfig model)
- yolos — YolosConfig (YolosConfig model)
- yoso — YosoConfig (YosoConfig model)
- youtu — YoutuConfig (YoutuConfig model)
- zamba — ZambaConfig (ZambaConfig model)
- zamba2 — Zamba2Config (Zamba2Config model)
- zoedepth — ZoeDepthConfig (ZoeDepthConfig model)
Examples:
>>> from transformers import AutoConfig
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-uncased")
>>> # Download configuration from huggingface.co (user-uploaded) and cache.
>>> config = AutoConfig.from_pretrained("dbmdz/bert-base-german-cased")
>>> # If configuration file is in a directory (e.g., was saved using *save_pretrained('./test/saved_model/')*).
>>> config = AutoConfig.from_pretrained("./test/bert_saved_model/")
>>> # Load a specific configuration file.
>>> config = AutoConfig.from_pretrained("./test/bert_saved_model/my_configuration.json")
>>> # Change some config attributes when loading a pretrained config.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-uncased", output_attentions=True, foo=False)
>>> config.output_attentions
True
>>> config, unused_kwargs = AutoConfig.from_pretrained(
... "google-bert/bert-base-uncased", output_attentions=True, foo=False, return_unused_kwargs=True
... )
>>> config.output_attentions
True
>>> unused_kwargs
{'foo': False}
register
< source >( model_type config exist_ok = False )
Parameters
- model_type (str) — The model type, like “bert” or “gpt”.
- config (PreTrainedConfig) — The config to register.
Register a new configuration for this class.
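As a minimal sketch of the registration flow (the `NewModelConfig` class, the `"new-model"` key, and the `hidden_size` attribute are placeholders for your own custom config):

```python
from transformers import AutoConfig, PretrainedConfig

# Hypothetical custom configuration; model_type must match the key
# passed to AutoConfig.register below.
class NewModelConfig(PretrainedConfig):
    model_type = "new-model"

    def __init__(self, hidden_size=64, **kwargs):
        self.hidden_size = hidden_size
        super().__init__(**kwargs)

# Register the config under its model_type key.
AutoConfig.register("new-model", NewModelConfig)

# AutoConfig.for_model now resolves the key to the registered class,
# forwarding any keyword arguments to its __init__.
config = AutoConfig.for_model("new-model", hidden_size=128)
print(type(config).__name__, config.hidden_size)
```

Once registered, `AutoConfig.from_pretrained()` will likewise resolve any checkpoint whose `config.json` declares `"model_type": "new-model"` to this class.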
AutoTokenizer
This is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when created with the AutoTokenizer.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained
< source >( pretrained_model_name_or_path *inputs **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a predefined tokenizer hosted inside a model repo on huggingface.co.
  - A path to a directory containing vocabulary files required by the tokenizer, for instance saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - A path to a single saved vocabulary file, if and only if the tokenizer only requires a single vocabulary file (like Bert or XLNet), e.g., ./my_model_directory/vocab.txt. (Not applicable to all derived classes)
- inputs (additional positional arguments, optional) — Will be passed along to the Tokenizer __init__() method.
- config (PreTrainedConfig, optional) — The configuration object used to determine the tokenizer class to instantiate.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- subfolder (str, optional) — In case the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g. for facebook/rag-token-base), specify it here.
- tokenizer_type (str, optional) — Tokenizer type to be loaded.
- backend (str, optional, defaults to "tokenizers") — Backend to use for tokenization. Valid options are:
  - "tokenizers": Use the Hugging Face tokenizers library backend (default)
  - "sentencepiece": Use the SentencePiece backend
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (additional keyword arguments, optional) — Will be passed to the Tokenizer __init__() method. Can be used to set special tokens like bos_token, eos_token, unk_token, sep_token, pad_token, cls_token, mask_token, additional_special_tokens. See parameters in the __init__() for more details.
Instantiate one of the tokenizer classes of the library from a pretrained model vocabulary.
The tokenizer class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- aimv2 — CLIPTokenizer (Aimv2Config model)
- albert — AlbertTokenizer (AlbertConfig model)
- align — BertTokenizer (AlignConfig model)
- audioflamingo3 — Qwen2Tokenizer (AudioFlamingo3Config model)
- aya_vision — CohereTokenizer (AyaVisionConfig model)
- bark — BertTokenizer (BarkConfig model)
- bart — RobertaTokenizer (BartConfig model)
- bert — BertTokenizer (BertConfig model)
- bert-generation — BertGenerationTokenizer (BertGenerationConfig model)
- big_bird — BigBirdTokenizer (BigBirdConfig model)
- bigbird_pegasus — PegasusTokenizer (BigBirdPegasusConfig model)
- biogpt — BioGptTokenizer (BioGptConfig model)
- blenderbot — BlenderbotTokenizer (BlenderbotConfig model)
- blenderbot-small — BlenderbotSmallTokenizer (BlenderbotSmallConfig model)
- blip — BertTokenizer (BlipConfig model)
- blip-2 — GPT2Tokenizer (Blip2Config model)
- bridgetower — RobertaTokenizer (BridgeTowerConfig model)
- bros — BertTokenizer (BrosConfig model)
- camembert — CamembertTokenizer (CamembertConfig model)
- canine — CanineTokenizer (CanineConfig model)
- chameleon — TokenizersBackend (ChameleonConfig model)
- chinese_clip — BertTokenizer (ChineseCLIPConfig model)
- clap — RobertaTokenizer (ClapConfig model)
- clip — CLIPTokenizer (CLIPConfig model)
- clipseg — CLIPTokenizer (CLIPSegConfig model)
- clvp — ClvpTokenizer (ClvpConfig model)
- codegen — GPT2Tokenizer (CodeGenConfig model)
- cohere — CohereTokenizer (CohereConfig model)
- cohere2 — CohereTokenizer (Cohere2Config model)
- cohere_asr — TokenizersBackend (CohereAsrConfig model)
- colqwen2 — Qwen2Tokenizer (ColQwen2Config model)
- convbert — BertTokenizer (ConvBertConfig model)
- cpmant — CpmAntTokenizer (CpmAntConfig model)
- ctrl — CTRLTokenizer (CTRLConfig model)
- data2vec-audio — Wav2Vec2CTCTokenizer (Data2VecAudioConfig model)
- data2vec-text — RobertaTokenizer (Data2VecTextConfig model)
- dbrx — GPT2Tokenizer (DbrxConfig model)
- deberta — DebertaTokenizer (DebertaConfig model)
- deberta-v2 — DebertaV2Tokenizer (DebertaV2Config model)
- deepseek_v2 — TokenizersBackend (DeepseekV2Config model)
- deepseek_v3 — TokenizersBackend (DeepseekV3Config model)
- deepseek_vl — TokenizersBackend (DeepseekVLConfig model)
- deepseek_vl_hybrid — TokenizersBackend (DeepseekVLHybridConfig model)
- dia — DiaTokenizer (DiaConfig model)
- distilbert — BertTokenizer (DistilBertConfig model)
- dpr — DPRQuestionEncoderTokenizer (DPRConfig model)
- electra — BertTokenizer (ElectraConfig model)
- emu3 — GPT2Tokenizer (Emu3Config model)
- ernie — BertTokenizer (ErnieConfig model)
- esm — EsmTokenizer (EsmConfig model)
- falcon_mamba — GPTNeoXTokenizer (FalconMambaConfig model)
- fastspeech2_conformer — None (FastSpeech2ConformerConfig model)
- flaubert — FlaubertTokenizer (FlaubertConfig model)
- flava — BertTokenizer (FlavaConfig model)
- flex_olmo — GPT2Tokenizer (FlexOlmoConfig model)
- florence2 — BartTokenizer (Florence2Config model)
- fnet — FNetTokenizer (FNetConfig model)
- fsmt — FSMTTokenizer (FSMTConfig model)
- funnel — FunnelTokenizer (FunnelConfig model)
- fuyu — TokenizersBackend (FuyuConfig model)
- gemma — GemmaTokenizer (GemmaConfig model)
- gemma2 — GemmaTokenizer (Gemma2Config model)
- gemma3 — GemmaTokenizer (Gemma3Config model)
- gemma3_text — GemmaTokenizer (Gemma3TextConfig model)
- gemma3n — GemmaTokenizer (Gemma3nConfig model)
- gemma3n_text — GemmaTokenizer (Gemma3nTextConfig model)
- git — BertTokenizer (GitConfig model)
- glm — TokenizersBackend (GlmConfig model)
- glm4 — TokenizersBackend (Glm4Config model)
- glm4_moe — TokenizersBackend (Glm4MoeConfig model)
- glm4_moe_lite — TokenizersBackend (Glm4MoeLiteConfig model)
- glm4v — TokenizersBackend (Glm4vConfig model)
- glm4v_moe — TokenizersBackend (Glm4vMoeConfig model)
- glm_image — TokenizersBackend (GlmImageConfig model)
- glmasr — TokenizersBackend (GlmAsrConfig model)
- got_ocr2 — TokenizersBackend (GotOcr2Config model)
- gpt-sw3 — GPTSw3Tokenizer (GPT2Config model)
- gpt2 — GPT2Tokenizer (GPT2Config model)
- gpt_bigcode — GPT2Tokenizer (GPTBigCodeConfig model)
- gpt_neo — GPT2Tokenizer (GPTNeoConfig model)
- gpt_neox — GPTNeoXTokenizer (GPTNeoXConfig model)
- gpt_neox_japanese — GPTNeoXJapaneseTokenizer (GPTNeoXJapaneseConfig model)
- gptj — GPT2Tokenizer (GPTJConfig model)
- granite — GPT2Tokenizer (GraniteConfig model)
- granitemoe — GPT2Tokenizer (GraniteMoeConfig model)
- granitemoehybrid — GPT2Tokenizer (GraniteMoeHybridConfig model)
- granitemoeshared — GPT2Tokenizer (GraniteMoeSharedConfig model)
- grounding-dino — BertTokenizer (GroundingDinoConfig model)
- groupvit — CLIPTokenizer (GroupViTConfig model)
- hubert — Wav2Vec2CTCTokenizer (HubertConfig model)
- ibert — RobertaTokenizer (IBertConfig model)
- idefics — LlamaTokenizer (IdeficsConfig model)
- idefics2 — LlamaTokenizer (Idefics2Config model)
- instructblip — GPT2Tokenizer (InstructBlipConfig model)
- instructblipvideo — GPT2Tokenizer (InstructBlipVideoConfig model)
- internvl — Qwen2Tokenizer (InternVLConfig model)
- jais2 — GPT2Tokenizer (Jais2Config model)
- jamba — TokenizersBackend (JambaConfig model)
- janus — TokenizersBackend (JanusConfig model)
- jina_embeddings_v3 — XLMRobertaTokenizer (JinaEmbeddingsV3Config model)
- kosmos-2 — XLMRobertaTokenizer (Kosmos2Config model)
- lasr_ctc — LasrTokenizer (LasrCTCConfig model)
- lasr_encoder — LasrTokenizer (LasrEncoderConfig model)
- layoutlm — BertTokenizer (LayoutLMConfig model)
- layoutlmv2 — LayoutLMv2Tokenizer (LayoutLMv2Config model)
- layoutlmv3 — LayoutLMv3Tokenizer (LayoutLMv3Config model)
- layoutxlm — LayoutXLMTokenizer (LayoutXLMConfig model)
- led — LEDTokenizer (LEDConfig model)
- lighton_ocr — Qwen2TokenizerFast (LightOnOcrConfig model)
- lilt — RobertaTokenizer (LiltConfig model)
- llava — TokenizersBackend (LlavaConfig model)
- llava_next — TokenizersBackend (LlavaNextConfig model)
- longformer — RobertaTokenizer (LongformerConfig model)
- luke — LukeTokenizer (LukeConfig model)
- lxmert — LxmertTokenizer (LxmertConfig model)
- m2m_100 — M2M100Tokenizer (M2M100Config model)
- mamba — GPTNeoXTokenizer (MambaConfig model)
- mamba2 — GPTNeoXTokenizer (Mamba2Config model)
- marian — MarianTokenizer (MarianConfig model)
- markuplm — MarkupLMTokenizer (MarkupLMConfig model)
- mbart — MBartTokenizer (MBartConfig model)
- megatron-bert — BertTokenizer (MegatronBertConfig model)
- metaclip_2 — XLMRobertaTokenizer (MetaClip2Config model)
- mgp-str — MgpstrTokenizer (MgpstrConfig model)
- minimax_m2 — TokenizersBackend (MiniMaxM2Config model)
- ministral — MistralCommonBackend (MinistralConfig model)
- ministral3 — MistralCommonBackend (Ministral3Config model)
- mistral — MistralCommonBackend (MistralConfig model)
- mistral3 — MistralCommonBackend (Mistral3Config model)
- mixtral — MistralCommonBackend (MixtralConfig model)
- mm-grounding-dino — BertTokenizer (MMGroundingDinoConfig model)
- mobilebert — MobileBertTokenizer (MobileBertConfig model)
- modernbert — TokenizersBackend (ModernBertConfig model)
- mpnet — MPNetTokenizer (MPNetConfig model)
- mpt — GPTNeoXTokenizer (MptConfig model)
- mra — RobertaTokenizer (MraConfig model)
- mt5 — T5Tokenizer (MT5Config model)
- musicgen — T5Tokenizer (MusicgenConfig model)
- musicgen_melody — T5Tokenizer (MusicgenMelodyConfig model)
- mvp — MvpTokenizer (MvpConfig model)
- nemotron — TokenizersBackend (NemotronConfig model)
- nllb-moe — NllbTokenizer (NllbMoeConfig model)
- nomic_bert — BertTokenizer (NomicBertConfig model)
- nougat — NougatTokenizer (NougatConfig model)
- nystromformer — AlbertTokenizer (NystromformerConfig model)
- olmo — GPTNeoXTokenizer (OlmoConfig model)
- olmo2 — GPTNeoXTokenizer (Olmo2Config model)
- olmo3 — TokenizersBackend (Olmo3Config model)
- olmo_hybrid — TokenizersBackend (OlmoHybridConfig model)
- olmoe — GPTNeoXTokenizer (OlmoeConfig model)
- omdet-turbo — CLIPTokenizer (OmDetTurboConfig model)
- oneformer — CLIPTokenizer (OneFormerConfig model)
- openai-gpt — OpenAIGPTTokenizer (OpenAIGPTConfig model)
- opt — GPT2Tokenizer (OPTConfig model)
- ovis2 — Qwen2Tokenizer (Ovis2Config model)
- owlv2 — CLIPTokenizer (Owlv2Config model)
- owlvit — CLIPTokenizer (OwlViTConfig model)
- pegasus — PegasusTokenizer (PegasusConfig model)
- pegasus_x — PegasusTokenizer (PegasusXConfig model)
- perceiver — PerceiverTokenizer (PerceiverConfig model)
- phi — GPT2Tokenizer (PhiConfig model)
- phi3 — TokenizersBackend (Phi3Config model)
- phimoe — TokenizersBackend (PhimoeConfig model)
- pix2struct — T5Tokenizer (Pix2StructConfig model)
- pixtral — MistralCommonBackend (PixtralVisionConfig model)
- plbart — PLBartTokenizer (PLBartConfig model)
- prophetnet — ProphetNetTokenizer (ProphetNetConfig model)
- qwen2 — Qwen2Tokenizer (Qwen2Config model)
- qwen2_5_omni — Qwen2Tokenizer (Qwen2_5OmniConfig model)
- qwen2_5_vl — Qwen2Tokenizer (Qwen2_5_VLConfig model)
- qwen2_audio — Qwen2Tokenizer (Qwen2AudioConfig model)
- qwen2_moe — Qwen2Tokenizer (Qwen2MoeConfig model)
- qwen2_vl — Qwen2Tokenizer (Qwen2VLConfig model)
- qwen3 — Qwen2Tokenizer (Qwen3Config model)
- qwen3_5 — Qwen3_5Tokenizer (Qwen3_5Config model)
- qwen3_5_moe — Qwen3_5Tokenizer (Qwen3_5MoeConfig model)
- qwen3_moe — Qwen2Tokenizer (Qwen3MoeConfig model)
- qwen3_next — Qwen2Tokenizer (Qwen3NextConfig model)
- qwen3_omni_moe — Qwen2Tokenizer (Qwen3OmniMoeConfig model)
- qwen3_vl — Qwen2Tokenizer (Qwen3VLConfig model)
- qwen3_vl_moe — Qwen2Tokenizer (Qwen3VLMoeConfig model)
- rag — RagTokenizer (RagConfig model)
- recurrent_gemma — GemmaTokenizer (RecurrentGemmaConfig model)
- reformer — ReformerTokenizer (ReformerConfig model)
- rembert — RemBertTokenizer (RemBertConfig model)
- roberta — RobertaTokenizer (RobertaConfig model)
- roberta-prelayernorm — RobertaTokenizer (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertTokenizer (RoCBertConfig model)
- roformer — RoFormerTokenizer (RoFormerConfig model)
- rwkv — GPTNeoXTokenizer (RwkvConfig model)
- sam3 — CLIPTokenizer (Sam3Config model)
- sam3_video — CLIPTokenizer (Sam3VideoConfig model)
- seamless_m4t — SeamlessM4TTokenizer (SeamlessM4TConfig model)
- seamless_m4t_v2 — SeamlessM4TTokenizer (SeamlessM4Tv2Config model)
- shieldgemma2 — GemmaTokenizer (ShieldGemma2Config model)
- siglip — SiglipTokenizer (SiglipConfig model)
- siglip2 — Siglip2Tokenizer (Siglip2Config model)
- speech_to_text — Speech2TextTokenizer (Speech2TextConfig model)
- speecht5 — SpeechT5Tokenizer (SpeechT5Config model)
- splinter — SplinterTokenizer (SplinterConfig model)
- squeezebert — BertTokenizer (SqueezeBertConfig model)
- stablelm — GPTNeoXTokenizer (StableLmConfig model)
- starcoder2 — GPT2Tokenizer (Starcoder2Config model)
- switch_transformers — T5Tokenizer (SwitchTransformersConfig model)
- t5 — T5Tokenizer (T5Config model)
- t5gemma — GemmaTokenizer (T5GemmaConfig model)
- tapas — TapasTokenizer (TapasConfig model)
- trocr — XLMRobertaTokenizer (TrOCRConfig model)
- tvp — BertTokenizer (TvpConfig model)
- udop — UdopTokenizer (UdopConfig model)
- umt5 — T5Tokenizer (UMT5Config model)
- unispeech — Wav2Vec2CTCTokenizer (UniSpeechConfig model)
- unispeech-sat — Wav2Vec2CTCTokenizer (UniSpeechSatConfig model)
- vilt — BertTokenizer (ViltConfig model)
- vipllava — TokenizersBackend (VipLlavaConfig model)
- visual_bert — BertTokenizer (VisualBertConfig model)
- vits — VitsTokenizer (VitsConfig model)
- voxtral — MistralCommonBackend (VoxtralConfig model)
- voxtral_realtime — MistralCommonBackend (VoxtralRealtimeConfig model)
- wav2vec2 — Wav2Vec2CTCTokenizer (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2CTCTokenizer (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2CTCTokenizer (Wav2Vec2ConformerConfig model)
- whisper — WhisperTokenizer (WhisperConfig model)
- xclip — CLIPTokenizer (XCLIPConfig model)
- xglm — XGLMTokenizer (XGLMConfig model)
- xlm — XLMTokenizer (XLMConfig model)
- xlm-roberta — XLMRobertaTokenizer (XLMRobertaConfig model)
- xlm-roberta-xl — XLMRobertaTokenizer (XLMRobertaXLConfig model)
- xlnet — XLNetTokenizer (XLNetConfig model)
- xlstm — GPTNeoXTokenizer (xLSTMConfig model)
- xmod — XLMRobertaTokenizer (XmodConfig model)
- yoso — AlbertTokenizer (YosoConfig model)
Examples:
>>> from transformers import AutoTokenizer
>>> # Download vocabulary from huggingface.co and cache.
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
>>> # Download vocabulary from huggingface.co (user-uploaded) and cache.
>>> tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-cased")
>>> # If vocabulary files are in a directory (e.g. tokenizer was saved using *save_pretrained('./test/saved_model/')*)
>>> # tokenizer = AutoTokenizer.from_pretrained("./test/bert_saved_model/")
>>> # Download vocabulary from huggingface.co and define model-specific arguments
>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-base", add_prefix_space=True)
>>> # Explicitly use the tokenizers backend
>>> tokenizer = AutoTokenizer.from_pretrained("hf-internal-testing/llama-tokenizer", backend="tokenizers")
>>> # Explicitly use the sentencepiece backend
>>> tokenizer = AutoTokenizer.from_pretrained("hf-internal-testing/llama-tokenizer", backend="sentencepiece")
register
< source >( config_class tokenizer_class = None slow_tokenizer_class = None fast_tokenizer_class = None exist_ok = False )
Parameters
- config_class (PreTrainedConfig) — The configuration corresponding to the model to register.
- tokenizer_class — The tokenizer class to register (preferred parameter as of v5).
- slow_tokenizer_class — (Deprecated) The slow tokenizer class to register.
- fast_tokenizer_class — (Deprecated) The fast tokenizer class to register.
Register a new tokenizer in this mapping.
AutoFeatureExtractor
This is a generic feature extractor class that will be instantiated as one of the feature extractor classes of the library when created with the AutoFeatureExtractor.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained
< source >( pretrained_model_name_or_path **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — This can be either:
  - a string, the model id of a pretrained feature_extractor hosted inside a model repo on huggingface.co.
  - a path to a directory containing a feature extractor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - a path to a saved feature extractor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the feature extractor files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running hf auth login (stored in ~/.huggingface).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns just the final feature extractor object. If True, it returns a tuple (feature_extractor, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of kwargs which has not been used to update feature_extractor and is otherwise ignored.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (dict[str, Any], optional) — The values in kwargs for any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not feature extractor attributes is controlled by the return_unused_kwargs keyword parameter.
Instantiate one of the feature extractor classes of the library from a pretrained model vocabulary.
The feature extractor class to instantiate is selected based on the model_type property of the config object
(either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s
missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- audio-spectrogram-transformer — ASTFeatureExtractor (ASTConfig model)
- audioflamingo3 — WhisperFeatureExtractor (AudioFlamingo3Config model)
- clap — ClapFeatureExtractor (ClapConfig model)
- clvp — ClvpFeatureExtractor (ClvpConfig model)
- cohere_asr — CohereAsrFeatureExtractor (CohereAsrConfig model)
- csm — EncodecFeatureExtractor (CsmConfig model)
- dac — DacFeatureExtractor (DacConfig model)
- data2vec-audio — Wav2Vec2FeatureExtractor (Data2VecAudioConfig model)
- dia — DiaFeatureExtractor (DiaConfig model)
- encodec — EncodecFeatureExtractor (EncodecConfig model)
- gemma3n — Gemma3nAudioFeatureExtractor (Gemma3nConfig model)
- gemma4 — Gemma4AudioFeatureExtractor (Gemma4Config model)
- glmasr — WhisperFeatureExtractor (GlmAsrConfig model)
- granite_speech — GraniteSpeechFeatureExtractor (GraniteSpeechConfig model)
- higgs_audio_v2_tokenizer — DacFeatureExtractor (HiggsAudioV2TokenizerConfig model)
- hubert — Wav2Vec2FeatureExtractor (HubertConfig model)
- kyutai_speech_to_text — KyutaiSpeechToTextFeatureExtractor (KyutaiSpeechToTextConfig model)
- lasr_ctc — LasrFeatureExtractor (LasrCTCConfig model)
- lasr_encoder — LasrFeatureExtractor (LasrEncoderConfig model)
- markuplm — MarkupLMFeatureExtractor (MarkupLMConfig model)
- mimi — EncodecFeatureExtractor (MimiConfig model)
- moonshine — Wav2Vec2FeatureExtractor (MoonshineConfig model)
- moshi — EncodecFeatureExtractor (MoshiConfig model)
- musicgen — EncodecFeatureExtractor (MusicgenConfig model)
- musicgen_melody — MusicgenMelodyFeatureExtractor (MusicgenMelodyConfig model)
- parakeet_ctc — ParakeetFeatureExtractor (ParakeetCTCConfig model)
- parakeet_encoder — ParakeetFeatureExtractor (ParakeetEncoderConfig model)
- pe_audio — PeAudioFeatureExtractor (PeAudioConfig model)
- pe_audio_video — PeAudioFeatureExtractor (PeAudioVideoConfig model)
- phi4_multimodal — Phi4MultimodalFeatureExtractor (Phi4MultimodalConfig model)
- pop2piano — Pop2PianoFeatureExtractor (Pop2PianoConfig model)
- qwen2_5_omni — WhisperFeatureExtractor (Qwen2_5OmniConfig model)
- qwen2_audio — WhisperFeatureExtractor (Qwen2AudioConfig model)
- qwen3_omni_moe — WhisperFeatureExtractor (Qwen3OmniMoeConfig model)
- seamless_m4t — SeamlessM4TFeatureExtractor (SeamlessM4TConfig model)
- seamless_m4t_v2 — SeamlessM4TFeatureExtractor (SeamlessM4Tv2Config model)
- sew — Wav2Vec2FeatureExtractor (SEWConfig model)
- sew-d — Wav2Vec2FeatureExtractor (SEWDConfig model)
- speech_to_text — Speech2TextFeatureExtractor (Speech2TextConfig model)
- speecht5 — SpeechT5FeatureExtractor (SpeechT5Config model)
- unispeech — Wav2Vec2FeatureExtractor (UniSpeechConfig model)
- unispeech-sat — Wav2Vec2FeatureExtractor (UniSpeechSatConfig model)
- univnet — UnivNetFeatureExtractor (UnivNetConfig model)
- vibevoice_acoustic_tokenizer — VibeVoiceAcousticTokenizerFeatureExtractor (VibeVoiceAcousticTokenizerConfig model)
- vibevoice_asr — VibeVoiceAcousticTokenizerFeatureExtractor (VibeVoiceAsrConfig model)
- voxtral — WhisperFeatureExtractor (VoxtralConfig model)
- voxtral_realtime — VoxtralRealtimeFeatureExtractor (VoxtralRealtimeConfig model)
- wav2vec2 — Wav2Vec2FeatureExtractor (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2FeatureExtractor (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2FeatureExtractor (Wav2Vec2ConformerConfig model)
- wavlm — Wav2Vec2FeatureExtractor (WavLMConfig model)
- whisper — WhisperFeatureExtractor (WhisperConfig model)
- xcodec — DacFeatureExtractor (XcodecConfig model)
Passing token=True is required when you want to use a private model.

Examples:

>>> from transformers import AutoFeatureExtractor

>>> # Download feature extractor from huggingface.co and cache.
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")

>>> # If feature extractor files are in a directory (e.g. feature extractor was saved using *save_pretrained('./test/saved_model/')*)
>>> # feature_extractor = AutoFeatureExtractor.from_pretrained("./test/saved_model/")

register
< source >( config_class feature_extractor_class exist_ok = False )
Parameters
- config_class (PreTrainedConfig) — The configuration corresponding to the model to register.
- feature_extractor_class (FeatureExtractorMixin) — The feature extractor to register.
Register a new feature extractor for this class.
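The registration pattern can be sketched with a minimal stand-in registry. This is illustrative only, not the transformers implementation; MiniAutoFeatureExtractor, NewModelConfig, and NewModelFeatureExtractor are hypothetical names:

```python
# Illustrative sketch only -- NOT the transformers implementation.
# NewModelConfig and NewModelFeatureExtractor are hypothetical names.

class NewModelConfig:
    model_type = "new-model"

class NewModelFeatureExtractor:
    pass

class MiniAutoFeatureExtractor:
    """Minimal stand-in for an Auto class holding a config -> extractor registry."""

    _registry: dict = {}

    @classmethod
    def register(cls, config_class, feature_extractor_class, exist_ok=False):
        # Mirror the exist_ok parameter: refuse to overwrite silently.
        if config_class in cls._registry and not exist_ok:
            raise ValueError(f"{config_class.__name__} is already registered")
        cls._registry[config_class] = feature_extractor_class

    @classmethod
    def for_config(cls, config):
        # Resolution step: the config's class selects the extractor class.
        return cls._registry[type(config)]

MiniAutoFeatureExtractor.register(NewModelConfig, NewModelFeatureExtractor)
extractor_cls = MiniAutoFeatureExtractor.for_config(NewModelConfig())
print(extractor_cls.__name__)  # NewModelFeatureExtractor
```

The real AutoFeatureExtractor.register call has the same shape: the config class is the key, the extractor class is the value, and exist_ok guards against accidental overwrites.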
AutoImageProcessor
This is a generic image processor class that will be instantiated as one of the image processor classes of the library when created with the AutoImageProcessor.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained
< source >( pretrained_model_name_or_path *inputs **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — This can be either:
  - a string, the model id of a pretrained image_processor hosted inside a model repo on huggingface.co.
  - a path to a directory containing an image processor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - a path to a saved image processor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model image processor should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force (re-)downloading the image processor files and override the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running hf auth login (stored in ~/.huggingface).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- use_fast (bool, optional, defaults to False) — Deprecated: use backend="torchvision" instead. This parameter is kept for backward compatibility. Use a fast torchvision-based image processor if it is supported for a given model. If a fast image processor is not available for a given model, a normal numpy-based image processor is returned instead.
- backend (str, optional, defaults to None) — The backend to use for image processing. Can be:
  - None: automatically select the best available backend (torchvision if available, otherwise pil)
  - "torchvision": use the torchvision backend (GPU-accelerated, faster)
  - "pil": use the PIL backend (portable, CPU-only)
  - any custom backend name registered via the register() method
- return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns just the final image processor object. If True, it returns a tuple (image_processor, unused_kwargs) where unused_kwargs is a dictionary of the key/value pairs whose keys are not image processor attributes, i.e., the part of kwargs which has not been used to update image_processor and is otherwise ignored.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- image_processor_filename (str, optional, defaults to "config.json") — The name of the file in the model directory to use for the image processor config.
- kwargs (dict[str, Any], optional) — The values in kwargs for any keys which are image processor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not image processor attributes is controlled by the return_unused_kwargs keyword parameter.
Instantiate one of the image processor classes of the library from a pretrained model vocabulary.
The image processor class to instantiate is selected based on the model_type property of the config object
(either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s
missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- aimv2 — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (Aimv2Config model)
- aimv2_vision_model — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (Aimv2VisionConfig model)
- align — {'torchvision': 'EfficientNetImageProcessor', 'pil': 'EfficientNetImageProcessorPil'} (AlignConfig model)
- altclip — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (AltCLIPConfig model)
- aria — {'pil': 'AriaImageProcessorPil', 'torchvision': 'AriaImageProcessor'} (AriaConfig model)
- aya_vision — {'torchvision': 'GotOcr2ImageProcessor', 'pil': 'GotOcr2ImageProcessorPil'} (AyaVisionConfig model)
- beit — {'pil': 'BeitImageProcessorPil', 'torchvision': 'BeitImageProcessor'} (BeitConfig model)
- bit — {'pil': 'BitImageProcessorPil', 'torchvision': 'BitImageProcessor'} (BitConfig model)
- blip — {'pil': 'BlipImageProcessorPil', 'torchvision': 'BlipImageProcessor'} (BlipConfig model)
- blip-2 — {'torchvision': 'BlipImageProcessor', 'pil': 'BlipImageProcessorPil'} (Blip2Config model)
- bridgetower — {'pil': 'BridgeTowerImageProcessorPil', 'torchvision': 'BridgeTowerImageProcessor'} (BridgeTowerConfig model)
- chameleon — {'pil': 'ChameleonImageProcessorPil', 'torchvision': 'ChameleonImageProcessor'} (ChameleonConfig model)
- chinese_clip — {'pil': 'ChineseCLIPImageProcessorPil', 'torchvision': 'ChineseCLIPImageProcessor'} (ChineseCLIPConfig model)
- chmv2 — {'torchvision': 'CHMv2ImageProcessor'} (CHMv2Config model)
- clip — {'pil': 'CLIPImageProcessorPil', 'torchvision': 'CLIPImageProcessor'} (CLIPConfig model)
- clipseg — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (CLIPSegConfig model)
- cohere2_vision — {'torchvision': 'Cohere2VisionImageProcessor'} (Cohere2VisionConfig model)
- colpali — {'torchvision': 'SiglipImageProcessor', 'pil': 'SiglipImageProcessorPil'} (ColPaliConfig model)
- colqwen2 — {'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'} (ColQwen2Config model)
- conditional_detr — {'pil': 'ConditionalDetrImageProcessorPil', 'torchvision': 'ConditionalDetrImageProcessor'} (ConditionalDetrConfig model)
- convnext — {'pil': 'ConvNextImageProcessorPil', 'torchvision': 'ConvNextImageProcessor'} (ConvNextConfig model)
- convnextv2 — {'torchvision': 'ConvNextImageProcessor', 'pil': 'ConvNextImageProcessorPil'} (ConvNextV2Config model)
- cvt — {'torchvision': 'ConvNextImageProcessor', 'pil': 'ConvNextImageProcessorPil'} (CvtConfig model)
- data2vec-vision — {'torchvision': 'BeitImageProcessor', 'pil': 'BeitImageProcessorPil'} (Data2VecVisionConfig model)
- deepseek_vl — {'pil': 'DeepseekVLImageProcessorPil', 'torchvision': 'DeepseekVLImageProcessor'} (DeepseekVLConfig model)
- deepseek_vl_hybrid — {'pil': 'DeepseekVLHybridImageProcessorPil', 'torchvision': 'DeepseekVLHybridImageProcessor'} (DeepseekVLHybridConfig model)
- deformable_detr — {'pil': 'DeformableDetrImageProcessorPil', 'torchvision': 'DeformableDetrImageProcessor'} (DeformableDetrConfig model)
- deit — {'pil': 'DeiTImageProcessorPil', 'torchvision': 'DeiTImageProcessor'} (DeiTConfig model)
- depth_anything — {'torchvision': 'DPTImageProcessor', 'pil': 'DPTImageProcessorPil'} (DepthAnythingConfig model)
- depth_pro — {'torchvision': 'DepthProImageProcessor'} (DepthProConfig model)
- detr — {'pil': 'DetrImageProcessorPil', 'torchvision': 'DetrImageProcessor'} (DetrConfig model)
- dinat — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (DinatConfig model)
- dinov2 — {'torchvision': 'BitImageProcessor', 'pil': 'BitImageProcessorPil'} (Dinov2Config model)
- dinov3_vit — {'torchvision': 'DINOv3ViTImageProcessor'} (DINOv3ViTConfig model)
- donut-swin — {'torchvision': 'DonutImageProcessor', 'pil': 'DonutImageProcessorPil'} (DonutSwinConfig model)
- dpt — {'pil': 'DPTImageProcessorPil', 'torchvision': 'DPTImageProcessor'} (DPTConfig model)
- edgetam — {'torchvision': 'Sam2ImageProcessor'} (EdgeTamConfig model)
- efficientloftr — {'pil': 'EfficientLoFTRImageProcessorPil', 'torchvision': 'EfficientLoFTRImageProcessor'} (EfficientLoFTRConfig model)
- efficientnet — {'pil': 'EfficientNetImageProcessorPil', 'torchvision': 'EfficientNetImageProcessor'} (EfficientNetConfig model)
- emu3 — {'pil': 'Emu3ImageProcessor'} (Emu3Config model)
- eomt — {'pil': 'EomtImageProcessorPil', 'torchvision': 'EomtImageProcessor'} (EomtConfig model)
- eomt_dinov3 — {'torchvision': 'EomtImageProcessor', 'pil': 'EomtImageProcessorPil'} (EomtDinov3Config model)
- ernie4_5_vl_moe — {'pil': 'Ernie4_5_VLMoeImageProcessorPil', 'torchvision': 'Ernie4_5_VLMoeImageProcessor'} (Ernie4_5_VLMoeConfig model)
- flava — {'pil': 'FlavaImageProcessorPil', 'torchvision': 'FlavaImageProcessor'} (FlavaConfig model)
- florence2 — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (Florence2Config model)
- focalnet — {'torchvision': 'BitImageProcessor', 'pil': 'BitImageProcessorPil'} (FocalNetConfig model)
- fuyu — {'pil': 'FuyuImageProcessorPil', 'torchvision': 'FuyuImageProcessor'} (FuyuConfig model)
- gemma3 — {'pil': 'Gemma3ImageProcessorPil', 'torchvision': 'Gemma3ImageProcessor'} (Gemma3Config model)
- gemma3n — {'torchvision': 'SiglipImageProcessor', 'pil': 'SiglipImageProcessorPil'} (Gemma3nConfig model)
- gemma4 — {'pil': 'Gemma4ImageProcessorPil', 'torchvision': 'Gemma4ImageProcessor'} (Gemma4Config model)
- git — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (GitConfig model)
- glm46v — {'pil': 'Glm46VImageProcessorPil', 'torchvision': 'Glm46VImageProcessor'} (Glm46VConfig model)
- glm4v — {'pil': 'Glm4vImageProcessorPil', 'torchvision': 'Glm4vImageProcessor'} (Glm4vConfig model)
- glm_image — {'pil': 'GlmImageImageProcessorPil', 'torchvision': 'GlmImageImageProcessor'} (GlmImageConfig model)
- glpn — {'pil': 'GLPNImageProcessorPil', 'torchvision': 'GLPNImageProcessor'} (GLPNConfig model)
- got_ocr2 — {'pil': 'GotOcr2ImageProcessorPil', 'torchvision': 'GotOcr2ImageProcessor'} (GotOcr2Config model)
- grounding-dino — {'pil': 'GroundingDinoImageProcessorPil', 'torchvision': 'GroundingDinoImageProcessor'} (GroundingDinoConfig model)
- groupvit — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (GroupViTConfig model)
- hiera — {'torchvision': 'BitImageProcessor', 'pil': 'BitImageProcessorPil'} (HieraConfig model)
- idefics — {'pil': 'IdeficsImageProcessorPil', 'torchvision': 'IdeficsImageProcessor'} (IdeficsConfig model)
- idefics2 — {'pil': 'Idefics2ImageProcessorPil', 'torchvision': 'Idefics2ImageProcessor'} (Idefics2Config model)
- idefics3 — {'pil': 'Idefics3ImageProcessorPil', 'torchvision': 'Idefics3ImageProcessor'} (Idefics3Config model)
- ijepa — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (IJepaConfig model)
- imagegpt — {'pil': 'ImageGPTImageProcessorPil', 'torchvision': 'ImageGPTImageProcessor'} (ImageGPTConfig model)
- instructblip — {'torchvision': 'BlipImageProcessor', 'pil': 'BlipImageProcessorPil'} (InstructBlipConfig model)
- internvl — {'torchvision': 'GotOcr2ImageProcessor', 'pil': 'GotOcr2ImageProcessorPil'} (InternVLConfig model)
- janus — {'pil': 'JanusImageProcessorPil', 'torchvision': 'JanusImageProcessor'} (JanusConfig model)
- kosmos-2 — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (Kosmos2Config model)
- kosmos-2.5 — {'torchvision': 'Kosmos2_5ImageProcessor', 'pil': 'Kosmos2_5ImageProcessorPil'} (Kosmos2_5Config model)
- layoutlmv2 — {'pil': 'LayoutLMv2ImageProcessorPil', 'torchvision': 'LayoutLMv2ImageProcessor'} (LayoutLMv2Config model)
- layoutlmv3 — {'pil': 'LayoutLMv3ImageProcessorPil', 'torchvision': 'LayoutLMv3ImageProcessor'} (LayoutLMv3Config model)
- layoutxlm — {'torchvision': 'LayoutLMv2ImageProcessor', 'pil': 'LayoutLMv2ImageProcessorPil'} (LayoutXLMConfig model)
- levit — {'pil': 'LevitImageProcessorPil', 'torchvision': 'LevitImageProcessor'} (LevitConfig model)
- lfm2_vl — {'torchvision': 'Lfm2VlImageProcessor'} (Lfm2VlConfig model)
- lightglue — {'pil': 'LightGlueImageProcessorPil', 'torchvision': 'LightGlueImageProcessor'} (LightGlueConfig model)
- lighton_ocr — {'torchvision': 'PixtralImageProcessor', 'pil': 'PixtralImageProcessorPil'} (LightOnOcrConfig model)
- llama4 — {'torchvision': 'Llama4ImageProcessor'} (Llama4Config model)
- llava — {'pil': 'LlavaImageProcessorPil', 'torchvision': 'LlavaImageProcessor'} (LlavaConfig model)
- llava_next — {'pil': 'LlavaNextImageProcessorPil', 'torchvision': 'LlavaNextImageProcessor'} (LlavaNextConfig model)
- llava_next_video — {'torchvision': 'LlavaNextImageProcessor', 'pil': 'LlavaNextImageProcessorPil'} (LlavaNextVideoConfig model)
- llava_onevision — {'pil': 'LlavaOnevisionImageProcessorPil', 'torchvision': 'LlavaOnevisionImageProcessor'} (LlavaOnevisionConfig model)
- lw_detr — {'torchvision': 'DeformableDetrImageProcessor', 'pil': 'DeformableDetrImageProcessorPil'} (LwDetrConfig model)
- mask2former — {'pil': 'Mask2FormerImageProcessorPil', 'torchvision': 'Mask2FormerImageProcessor'} (Mask2FormerConfig model)
- maskformer — {'pil': 'MaskFormerImageProcessorPil', 'torchvision': 'MaskFormerImageProcessor'} (MaskFormerConfig model)
- metaclip_2 — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (MetaClip2Config model)
- mgp-str — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (MgpstrConfig model)
- mistral3 — {'torchvision': 'PixtralImageProcessor', 'pil': 'PixtralImageProcessorPil'} (Mistral3Config model)
- mlcd — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (MLCDVisionConfig model)
- mllama — {'pil': 'MllamaImageProcessorPil', 'torchvision': 'MllamaImageProcessor'} (MllamaConfig model)
- mm-grounding-dino — {'torchvision': 'GroundingDinoImageProcessor', 'pil': 'GroundingDinoImageProcessorPil'} (MMGroundingDinoConfig model)
- mobilenet_v1 — {'pil': 'MobileNetV1ImageProcessorPil', 'torchvision': 'MobileNetV1ImageProcessor'} (MobileNetV1Config model)
- mobilenet_v2 — {'pil': 'MobileNetV2ImageProcessorPil', 'torchvision': 'MobileNetV2ImageProcessor'} (MobileNetV2Config model)
- mobilevit — {'pil': 'MobileViTImageProcessorPil', 'torchvision': 'MobileViTImageProcessor'} (MobileViTConfig model)
- mobilevitv2 — {'torchvision': 'MobileViTImageProcessor', 'pil': 'MobileViTImageProcessorPil'} (MobileViTV2Config model)
- nougat — {'pil': 'NougatImageProcessorPil', 'torchvision': 'NougatImageProcessor'} (NougatConfig model)
- omdet-turbo — {'torchvision': 'DetrImageProcessor', 'pil': 'DetrImageProcessorPil'} (OmDetTurboConfig model)
- oneformer — {'pil': 'OneFormerImageProcessorPil', 'torchvision': 'OneFormerImageProcessor'} (OneFormerConfig model)
- ovis2 — {'pil': 'Ovis2ImageProcessorPil', 'torchvision': 'Ovis2ImageProcessor'} (Ovis2Config model)
- owlv2 — {'pil': 'Owlv2ImageProcessorPil', 'torchvision': 'Owlv2ImageProcessor'} (Owlv2Config model)
- owlvit — {'pil': 'OwlViTImageProcessorPil', 'torchvision': 'OwlViTImageProcessor'} (OwlViTConfig model)
- paddleocr_vl — {'pil': 'PaddleOCRVLImageProcessorPil', 'torchvision': 'PaddleOCRVLImageProcessor'} (PaddleOCRVLConfig model)
- paligemma — {'torchvision': 'SiglipImageProcessor', 'pil': 'SiglipImageProcessorPil'} (PaliGemmaConfig model)
- perceiver — {'pil': 'PerceiverImageProcessorPil', 'torchvision': 'PerceiverImageProcessor'} (PerceiverConfig model)
- perception_lm — {'torchvision': 'PerceptionLMImageProcessor'} (PerceptionLMConfig model)
- phi4_multimodal — {'torchvision': 'Phi4MultimodalImageProcessor'} (Phi4MultimodalConfig model)
- pi0 — {'torchvision': 'PI0ImageProcessor'} (PI0Config model)
- pix2struct — {'pil': 'Pix2StructImageProcessorPil', 'torchvision': 'Pix2StructImageProcessor'} (Pix2StructConfig model)
- pixio — {'torchvision': 'BitImageProcessor', 'pil': 'BitImageProcessorPil'} (PixioConfig model)
- pixtral — {'pil': 'PixtralImageProcessorPil', 'torchvision': 'PixtralImageProcessor'} (PixtralVisionConfig model)
- poolformer — {'pil': 'PoolFormerImageProcessorPil', 'torchvision': 'PoolFormerImageProcessor'} (PoolFormerConfig model)
- pp_chart2table — {'pil': 'PPChart2TableImageProcessorPil', 'torchvision': 'PPChart2TableImageProcessor'} (PPChart2TableConfig model)
- pp_doclayout_v2 — {'torchvision': 'PPDocLayoutV2ImageProcessor'} (PPDocLayoutV2Config model)
- pp_doclayout_v3 — {'torchvision': 'PPDocLayoutV3ImageProcessor'} (PPDocLayoutV3Config model)
- pp_lcnet — {'torchvision': 'PPLCNetImageProcessor'} (PPLCNetConfig model)
- pp_ocrv5_mobile_det — {'torchvision': 'PPOCRV5ServerDetImageProcessor'} (PPOCRV5MobileDetConfig model)
- pp_ocrv5_mobile_rec — {'torchvision': 'PPOCRV5ServerRecImageProcessor'} (PPOCRV5MobileRecConfig model)
- pp_ocrv5_server_det — {'torchvision': 'PPOCRV5ServerDetImageProcessor'} (PPOCRV5ServerDetConfig model)
- pp_ocrv5_server_rec — {'torchvision': 'PPOCRV5ServerRecImageProcessor'} (PPOCRV5ServerRecConfig model)
- prompt_depth_anything — {'pil': 'PromptDepthAnythingImageProcessorPil', 'torchvision': 'PromptDepthAnythingImageProcessor'} (PromptDepthAnythingConfig model)
- pvt — {'pil': 'PvtImageProcessorPil', 'torchvision': 'PvtImageProcessor'} (PvtConfig model)
- pvt_v2 — {'torchvision': 'PvtImageProcessor', 'pil': 'PvtImageProcessorPil'} (PvtV2Config model)
- qwen2_5_omni — {'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'} (Qwen2_5OmniConfig model)
- qwen2_5_vl — {'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'} (Qwen2_5_VLConfig model)
- qwen2_vl — {'pil': 'Qwen2VLImageProcessorPil', 'torchvision': 'Qwen2VLImageProcessor'} (Qwen2VLConfig model)
- qwen3_5 — {'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'} (Qwen3_5Config model)
- qwen3_5_moe — {'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'} (Qwen3_5MoeConfig model)
- qwen3_omni_moe — {'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'} (Qwen3OmniMoeConfig model)
- qwen3_vl — {'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'} (Qwen3VLConfig model)
- regnet — {'torchvision': 'ConvNextImageProcessor', 'pil': 'ConvNextImageProcessorPil'} (RegNetConfig model)
- resnet — {'torchvision': 'ConvNextImageProcessor', 'pil': 'ConvNextImageProcessorPil'} (ResNetConfig model)
- rt_detr — {'pil': 'RTDetrImageProcessorPil', 'torchvision': 'RTDetrImageProcessor'} (RTDetrConfig model)
- sam — {'pil': 'SamImageProcessorPil', 'torchvision': 'SamImageProcessor'} (SamConfig model)
- sam2 — {'torchvision': 'Sam2ImageProcessor'} (Sam2Config model)
- sam2_video — {'torchvision': 'Sam2ImageProcessor'} (Sam2VideoConfig model)
- sam3 — {'torchvision': 'Sam3ImageProcessor'} (Sam3Config model)
- sam3_lite_text — {'torchvision': 'Sam3ImageProcessor'} (Sam3LiteTextConfig model)
- sam3_tracker — {'torchvision': 'Sam3ImageProcessor'} (Sam3TrackerConfig model)
- sam3_tracker_video — {'torchvision': 'Sam3ImageProcessor'} (Sam3TrackerVideoConfig model)
- sam3_video — {'torchvision': 'Sam3ImageProcessor'} (Sam3VideoConfig model)
- sam_hq — {'torchvision': 'SamImageProcessor', 'pil': 'SamImageProcessorPil'} (SamHQConfig model)
- segformer — {'pil': 'SegformerImageProcessorPil', 'torchvision': 'SegformerImageProcessor'} (SegformerConfig model)
- seggpt — {'pil': 'SegGptImageProcessorPil', 'torchvision': 'SegGptImageProcessor'} (SegGptConfig model)
- shieldgemma2 — {'torchvision': 'Gemma3ImageProcessor', 'pil': 'Gemma3ImageProcessorPil'} (ShieldGemma2Config model)
- siglip — {'pil': 'SiglipImageProcessorPil', 'torchvision': 'SiglipImageProcessor'} (SiglipConfig model)
- siglip2 — {'pil': 'Siglip2ImageProcessorPil', 'torchvision': 'Siglip2ImageProcessor'} (Siglip2Config model)
- slanext — {'torchvision': 'SLANeXtImageProcessor'} (SLANeXtConfig model)
- smolvlm — {'pil': 'SmolVLMImageProcessorPil', 'torchvision': 'SmolVLMImageProcessor'} (SmolVLMConfig model)
- superglue — {'pil': 'SuperGlueImageProcessorPil', 'torchvision': 'SuperGlueImageProcessor'} (SuperGlueConfig model)
- superpoint — {'pil': 'SuperPointImageProcessorPil', 'torchvision': 'SuperPointImageProcessor'} (SuperPointConfig model)
- swiftformer — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (SwiftFormerConfig model)
- swin — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (SwinConfig model)
- swin2sr — {'pil': 'Swin2SRImageProcessorPil', 'torchvision': 'Swin2SRImageProcessor'} (Swin2SRConfig model)
- swinv2 — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (Swinv2Config model)
- t5gemma2 — {'torchvision': 'Gemma3ImageProcessor', 'pil': 'Gemma3ImageProcessorPil'} (T5Gemma2Config model)
- t5gemma2_encoder — {'torchvision': 'Gemma3ImageProcessor', 'pil': 'Gemma3ImageProcessorPil'} (T5Gemma2EncoderConfig model)
- table-transformer — {'torchvision': 'DetrImageProcessor', 'pil': 'DetrImageProcessorPil'} (TableTransformerConfig model)
- textnet — {'pil': 'TextNetImageProcessorPil', 'torchvision': 'TextNetImageProcessor'} (TextNetConfig model)
- timesformer — {'pil': 'VideoMAEImageProcessorPil', 'torchvision': 'VideoMAEImageProcessor'} (TimesformerConfig model)
- timm_wrapper — {'pil': 'TimmWrapperImageProcessor'} (TimmWrapperConfig model)
- trocr — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (TrOCRConfig model)
- tvp — {'pil': 'TvpImageProcessorPil', 'torchvision': 'TvpImageProcessor'} (TvpConfig model)
- udop — {'torchvision': 'LayoutLMv3ImageProcessor', 'pil': 'LayoutLMv3ImageProcessorPil'} (UdopConfig model)
- upernet — {'torchvision': 'SegformerImageProcessor', 'pil': 'SegformerImageProcessorPil'} (UperNetConfig model)
- uvdoc — {'torchvision': 'UVDocImageProcessor'} (UVDocConfig model)
- video_llama_3 — {'pil': 'VideoLlama3ImageProcessorPil', 'torchvision': 'VideoLlama3ImageProcessor'} (VideoLlama3Config model)
- video_llava — {'pil': 'VideoLlavaImageProcessor'} (VideoLlavaConfig model)
- videomae — {'pil': 'VideoMAEImageProcessorPil', 'torchvision': 'VideoMAEImageProcessor'} (VideoMAEConfig model)
- vilt — {'pil': 'ViltImageProcessorPil', 'torchvision': 'ViltImageProcessor'} (ViltConfig model)
- vipllava — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (VipLlavaConfig model)
- vit — {'pil': 'ViTImageProcessorPil', 'torchvision': 'ViTImageProcessor'} (ViTConfig model)
- vit_mae — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (ViTMAEConfig model)
- vit_msn — {'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'} (ViTMSNConfig model)
- vitmatte — {'pil': 'VitMatteImageProcessorPil', 'torchvision': 'VitMatteImageProcessor'} (VitMatteConfig model)
- vitpose — {'pil': 'VitPoseImageProcessorPil', 'torchvision': 'VitPoseImageProcessor'} (VitPoseConfig model)
- vivit — {'torchvision': 'VivitImageProcessor'} (VivitConfig model)
- xclip — {'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'} (XCLIPConfig model)
- yolos — {'pil': 'YolosImageProcessorPil', 'torchvision': 'YolosImageProcessor'} (YolosConfig model)
- zoedepth — {'pil': 'ZoeDepthImageProcessorPil', 'torchvision': 'ZoeDepthImageProcessor'} (ZoeDepthConfig model)
Passing token=True is required when you want to use a private model.

Examples:

>>> from transformers import AutoImageProcessor

>>> # Download image processor from huggingface.co and cache.
>>> image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")

>>> # If image processor files are in a directory (e.g. image processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # image_processor = AutoImageProcessor.from_pretrained("./test/saved_model/")

register
< source >( config_class slow_image_processor_class: type | None = None fast_image_processor_class: type | None = None image_processor_classes: dict[str, type] | None = None exist_ok: bool = False )
Parameters
- config_class (PreTrainedConfig) — The configuration corresponding to the model to register.
- slow_image_processor_class (type, optional) — The PIL backend image processor class (deprecated, use image_processor_classes={"pil": ...}).
- fast_image_processor_class (type, optional) — The torchvision backend image processor class (deprecated, use image_processor_classes={"torchvision": ...}).
- image_processor_classes (dict[str, type], optional) — Dictionary mapping backend names to image processor classes, which allows registering custom backends. Example: {"pil": MyPilProcessor, "torchvision": MyTorchvisionProcessor, "custom": MyCustomProcessor}
- exist_ok (bool, optional, defaults to False) — If True, allow overwriting existing registrations.
Register a new image processor for this class.
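The backend-keyed registration described above (a dict of backend name to processor class per model, plus automatic backend selection) can be sketched with a small stand-in. This is illustrative only, not the transformers implementation; MyPilProcessor and MyTorchvisionProcessor are hypothetical classes:

```python
# Illustrative sketch only -- NOT the transformers implementation.
# MyPilProcessor and MyTorchvisionProcessor are hypothetical names.

class MyPilProcessor:
    pass

class MyTorchvisionProcessor:
    pass

_IMAGE_PROCESSOR_REGISTRY = {}

def register(model_type, image_processor_classes, exist_ok=False):
    # image_processor_classes maps backend names to processor classes,
    # e.g. {"pil": ..., "torchvision": ...}, as in the parameter above.
    if model_type in _IMAGE_PROCESSOR_REGISTRY and not exist_ok:
        raise ValueError(f"{model_type!r} is already registered")
    _IMAGE_PROCESSOR_REGISTRY[model_type] = dict(image_processor_classes)

def resolve(model_type, backend=None, torchvision_available=True):
    backends = _IMAGE_PROCESSOR_REGISTRY[model_type]
    if backend is None:
        # backend=None: prefer torchvision when available, else fall back to pil.
        backend = "torchvision" if torchvision_available and "torchvision" in backends else "pil"
    return backends[backend]

register("my-model", {"pil": MyPilProcessor, "torchvision": MyTorchvisionProcessor})
print(resolve("my-model").__name__)                               # MyTorchvisionProcessor
print(resolve("my-model", torchvision_available=False).__name__)  # MyPilProcessor
```

Passing an explicit backend= name skips the automatic preference, which is what AutoImageProcessor.from_pretrained(..., backend="pil") does for a registered model.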
AutoVideoProcessor
This is a generic video processor class that will be instantiated as one of the video processor classes of the library when created with the AutoVideoProcessor.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained
< source >( pretrained_model_name_or_path *inputs **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — This can be either:
  - a string, the model id of a pretrained video_processor hosted inside a model repo on huggingface.co.
  - a path to a directory containing a video processor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - a path to a saved video processor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model video processor should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force (re-)downloading the video processor files and override the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running hf auth login (stored in ~/.huggingface).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns just the final video processor object. If True, it returns a tuple (video_processor, unused_kwargs) where unused_kwargs is a dictionary of the key/value pairs whose keys are not video processor attributes, i.e., the part of kwargs which has not been used to update video_processor and is otherwise ignored.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (dict[str, Any], optional) — The values in kwargs for any keys which are video processor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not video processor attributes is controlled by the return_unused_kwargs keyword parameter.
Instantiate one of the video processor classes of the library from a pretrained model vocabulary.
The video processor class to instantiate is selected based on the model_type property of the config object
(either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s
missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- ernie4_5_vl_moe — Ernie4_5_VLMoeVideoProcessor (Ernie4_5_VLMoeConfig model)
- gemma4 — Gemma4VideoProcessor (Gemma4Config model)
- glm46v — Glm46VVideoProcessor (Glm46VConfig model)
- glm4v — Glm4vVideoProcessor (Glm4vConfig model)
- instructblip — InstructBlipVideoVideoProcessor (InstructBlipConfig model)
- instructblipvideo — InstructBlipVideoVideoProcessor (InstructBlipVideoConfig model)
- internvl — InternVLVideoProcessor (InternVLConfig model)
- llava_next_video — LlavaNextVideoVideoProcessor (LlavaNextVideoConfig model)
- llava_onevision — LlavaOnevisionVideoProcessor (LlavaOnevisionConfig model)
- pe_audio_video — PeVideoVideoProcessor (PeAudioVideoConfig model)
- pe_video — PeVideoVideoProcessor (PeVideoConfig model)
- perception_lm — PerceptionLMVideoProcessor (PerceptionLMConfig model)
- qwen2_5_omni — Qwen2VLVideoProcessor (Qwen2_5OmniConfig model)
- qwen2_5_vl — Qwen2VLVideoProcessor (Qwen2_5_VLConfig model)
- qwen2_vl — Qwen2VLVideoProcessor (Qwen2VLConfig model)
- qwen3_5 — Qwen3VLVideoProcessor (Qwen3_5Config model)
- qwen3_5_moe — Qwen3VLVideoProcessor (Qwen3_5MoeConfig model)
- qwen3_omni_moe — Qwen2VLVideoProcessor (Qwen3OmniMoeConfig model)
- qwen3_vl — Qwen3VLVideoProcessor (Qwen3VLConfig model)
- qwen3_vl_moe — Qwen3VLVideoProcessor (Qwen3VLMoeConfig model)
- sam2_video — Sam2VideoVideoProcessor (Sam2VideoConfig model)
- smolvlm — SmolVLMVideoProcessor (SmolVLMConfig model)
- video_llama_3 — VideoLlama3VideoProcessor (VideoLlama3Config model)
- video_llava — VideoLlavaVideoProcessor (VideoLlavaConfig model)
- videomae — VideoMAEVideoProcessor (VideoMAEConfig model)
- videomt — VideomtVideoProcessor (VideomtConfig model)
- vjepa2 — VJEPA2VideoProcessor (VJEPA2Config model)
Passing token=True is required when you want to use a private model.

Examples:

>>> from transformers import AutoVideoProcessor

>>> # Download video processor from huggingface.co and cache.
>>> video_processor = AutoVideoProcessor.from_pretrained("llava-hf/llava-onevision-qwen2-0.5b-ov-hf")

>>> # If video processor files are in a directory (e.g. video processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # video_processor = AutoVideoProcessor.from_pretrained("./test/saved_model/")

register
< source >( config_class video_processor_class exist_ok = False )
Parameters
- config_class (PreTrainedConfig) — The configuration corresponding to the model to register.
- video_processor_class (BaseVideoProcessor) — The video processor to register.
Register a new video processor for this class.
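The two-step selection described in the from_pretrained docs above (use the config's model_type when available, otherwise pattern-match the name/path) can be sketched like this. The mapping below is a tiny illustrative subset of the full table, not the transformers implementation:

```python
# Illustrative sketch only -- NOT the transformers implementation.
# A tiny subset of the model_type -> video processor mapping from the docs above.
VIDEO_PROCESSOR_MAPPING = {
    "qwen2_vl": "Qwen2VLVideoProcessor",
    "smolvlm": "SmolVLMVideoProcessor",
}

def select_video_processor(name_or_path, model_type=None):
    # Prefer the explicit model_type read from the loaded config.
    if model_type is not None:
        return VIDEO_PROCESSOR_MAPPING[model_type]
    # Fallback: pattern matching on the pretrained name/path.
    normalized = name_or_path.lower().replace("-", "_")
    for key, processor in VIDEO_PROCESSOR_MAPPING.items():
        if key in normalized:
            return processor
    raise ValueError(f"Could not infer a video processor for {name_or_path!r}")

print(select_video_processor("my-org/qwen2-vl-7b"))  # Qwen2VLVideoProcessor
```

In practice the config lookup almost always succeeds, so the pattern-matching fallback only matters when no config can be loaded from the name or path.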
AutoProcessor
This is a generic processor class that will be instantiated as one of the processor classes of the library when created with the AutoProcessor.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained
< source >( pretrained_model_name_or_path **kwargs )
Parameters
- pretrained_model_name_or_path (`str` or `os.PathLike`) — This can be either:
  - a string, the model id of a pretrained feature extractor hosted inside a model repo on huggingface.co.
  - a path to a directory containing processor files saved using the save_pretrained() method, e.g., `./my_model_directory/`.
- cache_dir (`str` or `os.PathLike`, optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
- force_download (`bool`, optional, defaults to `False`) — Whether or not to force (re-)downloading the feature extractor files, overriding the cached versions if they exist.
- proxies (`dict[str, str]`, optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
- token (`str` or `bool`, optional) — The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `hf auth login` (stored in `~/.huggingface`).
- revision (`str`, optional, defaults to `"main"`) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.
- return_unused_kwargs (`bool`, optional, defaults to `False`) — If `False`, this function returns just the final feature extractor object. If `True`, it returns a tuple `(feature_extractor, unused_kwargs)` where `unused_kwargs` is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of `kwargs` which has not been used to update `feature_extractor` and is otherwise ignored.
- trust_remote_code (`bool`, optional, defaults to `False`) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (`dict[str, Any]`, optional) — The values in `kwargs` for any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not feature extractor attributes is controlled by the `return_unused_kwargs` keyword parameter.
Instantiate one of the processor classes of the library from a pretrained model vocabulary.
The processor class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible):
- aimv2 — CLIPProcessor (Aimv2Config model)
- align — AlignProcessor (AlignConfig model)
- altclip — AltCLIPProcessor (AltCLIPConfig model)
- aria — AriaProcessor (AriaConfig model)
- audioflamingo3 — AudioFlamingo3Processor (AudioFlamingo3Config model)
- aya_vision — AyaVisionProcessor (AyaVisionConfig model)
- bark — BarkProcessor (BarkConfig model)
- blip — BlipProcessor (BlipConfig model)
- blip-2 — Blip2Processor (Blip2Config model)
- bridgetower — BridgeTowerProcessor (BridgeTowerConfig model)
- chameleon — ChameleonProcessor (ChameleonConfig model)
- chinese_clip — ChineseCLIPProcessor (ChineseCLIPConfig model)
- clap — ClapProcessor (ClapConfig model)
- clip — CLIPProcessor (CLIPConfig model)
- clipseg — CLIPSegProcessor (CLIPSegConfig model)
- clvp — ClvpProcessor (ClvpConfig model)
- cohere2_vision — Cohere2VisionProcessor (Cohere2VisionConfig model)
- cohere_asr — CohereAsrProcessor (CohereAsrConfig model)
- colmodernvbert — ColModernVBertProcessor (ColModernVBertConfig model)
- colpali — ColPaliProcessor (ColPaliConfig model)
- colqwen2 — ColQwen2Processor (ColQwen2Config model)
- deepseek_vl — DeepseekVLProcessor (DeepseekVLConfig model)
- deepseek_vl_hybrid — DeepseekVLHybridProcessor (DeepseekVLHybridConfig model)
- dia — DiaProcessor (DiaConfig model)
- edgetam — Sam2Processor (EdgeTamConfig model)
- emu3 — Emu3Processor (Emu3Config model)
- ernie4_5_vl_moe — Ernie4_5_VLMoeProcessor (Ernie4_5_VLMoeConfig model)
- evolla — EvollaProcessor (EvollaConfig model)
- flava — FlavaProcessor (FlavaConfig model)
- florence2 — Florence2Processor (Florence2Config model)
- fuyu — FuyuProcessor (FuyuConfig model)
- gemma3 — Gemma3Processor (Gemma3Config model)
- gemma3n — Gemma3nProcessor (Gemma3nConfig model)
- gemma4 — Gemma4Processor (Gemma4Config model)
- git — GitProcessor (GitConfig model)
- glm46v — Glm46VProcessor (Glm46VConfig model)
- glm4v — Glm4vProcessor (Glm4vConfig model)
- glm4v_moe — Glm4vProcessor (Glm4vMoeConfig model)
- glm_image — Glm4vProcessor (GlmImageConfig model)
- glmasr — GlmAsrProcessor (GlmAsrConfig model)
- got_ocr2 — GotOcr2Processor (GotOcr2Config model)
- granite_speech — GraniteSpeechProcessor (GraniteSpeechConfig model)
- grounding-dino — GroundingDinoProcessor (GroundingDinoConfig model)
- groupvit — CLIPProcessor (GroupViTConfig model)
- higgs_audio_v2 — HiggsAudioV2Processor (HiggsAudioV2Config model)
- hubert — Wav2Vec2Processor (HubertConfig model)
- idefics — IdeficsProcessor (IdeficsConfig model)
- idefics2 — Idefics2Processor (Idefics2Config model)
- idefics3 — Idefics3Processor (Idefics3Config model)
- instructblip — InstructBlipProcessor (InstructBlipConfig model)
- instructblipvideo — InstructBlipVideoProcessor (InstructBlipVideoConfig model)
- internvl — InternVLProcessor (InternVLConfig model)
- janus — JanusProcessor (JanusConfig model)
- kosmos-2 — Kosmos2Processor (Kosmos2Config model)
- kosmos-2.5 — Kosmos2_5Processor (Kosmos2_5Config model)
- kyutai_speech_to_text — KyutaiSpeechToTextProcessor (KyutaiSpeechToTextConfig model)
- lasr_ctc — LasrProcessor (LasrCTCConfig model)
- lasr_encoder — LasrProcessor (LasrEncoderConfig model)
- layoutlmv2 — LayoutLMv2Processor (LayoutLMv2Config model)
- layoutlmv3 — LayoutLMv3Processor (LayoutLMv3Config model)
- layoutxlm — LayoutXLMProcessor (LayoutXLMConfig model)
- lfm2_vl — Lfm2VlProcessor (Lfm2VlConfig model)
- lighton_ocr — LightOnOcrProcessor (LightOnOcrConfig model)
- llama4 — Llama4Processor (Llama4Config model)
- llava — LlavaProcessor (LlavaConfig model)
- llava_next — LlavaNextProcessor (LlavaNextConfig model)
- llava_next_video — LlavaNextVideoProcessor (LlavaNextVideoConfig model)
- llava_onevision — LlavaOnevisionProcessor (LlavaOnevisionConfig model)
- markuplm — MarkupLMProcessor (MarkupLMConfig model)
- metaclip_2 — CLIPProcessor (MetaClip2Config model)
- mgp-str — MgpstrProcessor (MgpstrConfig model)
- mistral3 — PixtralProcessor (Mistral3Config model)
- mllama — MllamaProcessor (MllamaConfig model)
- mm-grounding-dino — GroundingDinoProcessor (MMGroundingDinoConfig model)
- modernvbert — Idefics3Processor (ModernVBertConfig model)
- moonshine — Wav2Vec2Processor (MoonshineConfig model)
- moonshine_streaming — MoonshineStreamingProcessor (MoonshineStreamingConfig model)
- musicflamingo — MusicFlamingoProcessor (MusicFlamingoConfig model)
- omdet-turbo — OmDetTurboProcessor (OmDetTurboConfig model)
- oneformer — OneFormerProcessor (OneFormerConfig model)
- ovis2 — Ovis2Processor (Ovis2Config model)
- owlv2 — Owlv2Processor (Owlv2Config model)
- owlvit — OwlViTProcessor (OwlViTConfig model)
- paddleocr_vl — PaddleOCRVLProcessor (PaddleOCRVLConfig model)
- paligemma — PaliGemmaProcessor (PaliGemmaConfig model)
- perception_lm — PerceptionLMProcessor (PerceptionLMConfig model)
- phi4_multimodal — Phi4MultimodalProcessor (Phi4MultimodalConfig model)
- pi0 — PI0Processor (PI0Config model)
- pix2struct — Pix2StructProcessor (Pix2StructConfig model)
- pixtral — PixtralProcessor (PixtralVisionConfig model)
- pop2piano — Pop2PianoProcessor (Pop2PianoConfig model)
- pp_chart2table — PPChart2TableProcessor (PPChart2TableConfig model)
- qwen2_5_omni — Qwen2_5OmniProcessor (Qwen2_5OmniConfig model)
- qwen2_5_vl — Qwen2_5_VLProcessor (Qwen2_5_VLConfig model)
- qwen2_audio — Qwen2AudioProcessor (Qwen2AudioConfig model)
- qwen2_vl — Qwen2VLProcessor (Qwen2VLConfig model)
- qwen3_5 — Qwen3VLProcessor (Qwen3_5Config model)
- qwen3_5_moe — Qwen3VLProcessor (Qwen3_5MoeConfig model)
- qwen3_omni_moe — Qwen3OmniMoeProcessor (Qwen3OmniMoeConfig model)
- qwen3_vl — Qwen3VLProcessor (Qwen3VLConfig model)
- qwen3_vl_moe — Qwen3VLProcessor (Qwen3VLMoeConfig model)
- sam — SamProcessor (SamConfig model)
- sam2 — Sam2Processor (Sam2Config model)
- sam3 — Sam3Processor (Sam3Config model)
- sam3_lite_text — Sam3Processor (Sam3LiteTextConfig model)
- sam_hq — SamHQProcessor (SamHQConfig model)
- seamless_m4t — SeamlessM4TProcessor (SeamlessM4TConfig model)
- sew — Wav2Vec2Processor (SEWConfig model)
- sew-d — Wav2Vec2Processor (SEWDConfig model)
- shieldgemma2 — ShieldGemma2Processor (ShieldGemma2Config model)
- siglip — SiglipProcessor (SiglipConfig model)
- siglip2 — Siglip2Processor (Siglip2Config model)
- smolvlm — SmolVLMProcessor (SmolVLMConfig model)
- speech_to_text — Speech2TextProcessor (Speech2TextConfig model)
- speecht5 — SpeechT5Processor (SpeechT5Config model)
- t5gemma2 — Gemma3Processor (T5Gemma2Config model)
- t5gemma2_encoder — Gemma3Processor (T5Gemma2EncoderConfig model)
- trocr — TrOCRProcessor (TrOCRConfig model)
- tvp — TvpProcessor (TvpConfig model)
- udop — UdopProcessor (UdopConfig model)
- unispeech — Wav2Vec2Processor (UniSpeechConfig model)
- unispeech-sat — Wav2Vec2Processor (UniSpeechSatConfig model)
- vibevoice_asr — VibeVoiceAsrProcessor (VibeVoiceAsrConfig model)
- video_llava — VideoLlavaProcessor (VideoLlavaConfig model)
- vilt — ViltProcessor (ViltConfig model)
- vipllava — LlavaProcessor (VipLlavaConfig model)
- vision-text-dual-encoder — VisionTextDualEncoderProcessor (VisionTextDualEncoderConfig model)
- voxtral — VoxtralProcessor (VoxtralConfig model)
- voxtral_realtime — VoxtralRealtimeProcessor (VoxtralRealtimeConfig model)
- wav2vec2 — Wav2Vec2Processor (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2Processor (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2Processor (Wav2Vec2ConformerConfig model)
- wavlm — Wav2Vec2Processor (WavLMConfig model)
- whisper — WhisperProcessor (WhisperConfig model)
- xclip — XCLIPProcessor (XCLIPConfig model)
Passing `token=True` is required when you want to use a private model.
Examples:
>>> from transformers import AutoProcessor
>>> # Download processor from huggingface.co and cache.
>>> processor = AutoProcessor.from_pretrained("facebook/wav2vec2-base-960h")
>>> # If processor files are in a directory (e.g. processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # processor = AutoProcessor.from_pretrained("./test/saved_model/")
register
< source >( config_class processor_class exist_ok = False )
Parameters
- config_class (PreTrainedConfig) — The configuration corresponding to the model to register.
- processor_class (ProcessorMixin) — The processor to register.
Register a new processor for this class.
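Following the same pattern as the other auto classes, a registration sketch might look like this. `NewModelConfig` and `NewModelProcessor` are hypothetical illustration-only names; the config's `model_type` must match the key passed to `AutoConfig.register`:

```python
# A minimal registration sketch; NewModelConfig and NewModelProcessor are
# hypothetical illustration-only classes, not part of the library.
from transformers import AutoConfig, AutoProcessor, PretrainedConfig
from transformers.processing_utils import ProcessorMixin


class NewModelConfig(PretrainedConfig):
    model_type = "new-model"  # must match the key used with AutoConfig.register


class NewModelProcessor(ProcessorMixin):
    # A real processor would list its components here,
    # e.g. ["image_processor", "tokenizer"]; empty for this toy example.
    attributes = []


AutoConfig.register("new-model", NewModelConfig)
AutoProcessor.register(NewModelConfig, NewModelProcessor)
```

Once registered, `AutoProcessor.from_pretrained` can dispatch to `NewModelProcessor` for any checkpoint whose config declares `"model_type": "new-model"`.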
Generic model classes
The following auto classes are available for instantiating a base model class without a specific head.
AutoModel
This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
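Before the full mapping below, a small sketch of this config-based dispatch may help; it uses a deliberately tiny `BertConfig` (the hyperparameter values are arbitrary) so the randomly initialized model stays small and nothing is downloaded:

```python
# AutoModel dispatches on the configuration class: a BertConfig yields a
# BertModel. The tiny hyperparameters keep the randomly initialized model small.
from transformers import AutoModel, BertConfig, BertModel

config = BertConfig(
    vocab_size=128,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
model = AutoModel.from_config(config)
print(isinstance(model, BertModel))  # True
```

Note that `from_config` only builds the architecture with random weights; use `from_pretrained` to also load trained weights.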
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- ASTConfig configuration class: ASTModel (ASTConfig model)
- AfmoeConfig configuration class: AfmoeModel (AfmoeConfig model)
- Aimv2Config configuration class: Aimv2Model (Aimv2Config model)
- Aimv2VisionConfig configuration class: Aimv2VisionModel (Aimv2VisionConfig model)
- AlbertConfig configuration class: AlbertModel (AlbertConfig model)
- AlignConfig configuration class: AlignModel (AlignConfig model)
AlbertModel(AlbertConfig model) - AlignConfig configuration class: AlignModel (AlignConfig model)
- AltCLIPConfig configuration class: AltCLIPModel (AltCLIPConfig model)
- ApertusConfig configuration class: ApertusModel (ApertusConfig model)
- ArceeConfig configuration class: ArceeModel (ArceeConfig model)
- AriaConfig configuration class: AriaModel (AriaConfig model)
- AriaTextConfig configuration class: AriaTextModel (AriaTextConfig model)
- AudioFlamingo3Config configuration class: AudioFlamingo3ForConditionalGeneration (AudioFlamingo3Config model)
- AudioFlamingo3EncoderConfig configuration class: AudioFlamingo3Encoder (AudioFlamingo3EncoderConfig model)
- AutoformerConfig configuration class: AutoformerModel (AutoformerConfig model)
- AyaVisionConfig configuration class: AyaVisionModel (AyaVisionConfig model)
- BambaConfig configuration class: BambaModel (BambaConfig model)
- BarkConfig configuration class: BarkModel (BarkConfig model)
- BartConfig configuration class: BartModel (BartConfig model)
- BeitConfig configuration class: BeitModel (BeitConfig model)
- BertConfig configuration class: BertModel (BertConfig model)
- BertGenerationConfig configuration class: BertGenerationEncoder (BertGenerationConfig model)
- BigBirdConfig configuration class: BigBirdModel (BigBirdConfig model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusModel (BigBirdPegasusConfig model)
- BioGptConfig configuration class: BioGptModel (BioGptConfig model)
- BitConfig configuration class: BitModel (BitConfig model)
- BitNetConfig configuration class: BitNetModel (BitNetConfig model)
- BlenderbotConfig configuration class: BlenderbotModel (BlenderbotConfig model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallModel (BlenderbotSmallConfig model)
- Blip2Config configuration class: Blip2Model (Blip2Config model)
- Blip2QFormerConfig configuration class: Blip2QFormerModel (Blip2QFormerConfig model)
- BlipConfig configuration class: BlipModel (BlipConfig model)
- BloomConfig configuration class: BloomModel (BloomConfig model)
- BltConfig configuration class: BltModel (BltConfig model)
- BridgeTowerConfig configuration class: BridgeTowerModel (BridgeTowerConfig model)
- BrosConfig configuration class: BrosModel (BrosConfig model)
- CLIPConfig configuration class: CLIPModel (CLIPConfig model)
- CLIPSegConfig configuration class: CLIPSegModel (CLIPSegConfig model)
- CLIPTextConfig configuration class: CLIPTextModel (CLIPTextConfig model)
- CLIPVisionConfig configuration class: CLIPVisionModel (CLIPVisionConfig model)
- CTRLConfig configuration class: CTRLModel (CTRLConfig model)
- CamembertConfig configuration class: CamembertModel (CamembertConfig model)
- CanineConfig configuration class: CanineModel (CanineConfig model)
- ChameleonConfig configuration class: ChameleonModel (ChameleonConfig model)
- ChineseCLIPConfig configuration class: ChineseCLIPModel (ChineseCLIPConfig model)
- ChineseCLIPVisionConfig configuration class: ChineseCLIPVisionModel (ChineseCLIPVisionConfig model)
- ClapConfig configuration class: ClapModel (ClapConfig model)
- ClvpConfig configuration class: ClvpModelForConditionalGeneration (ClvpConfig model)
- CodeGenConfig configuration class: CodeGenModel (CodeGenConfig model)
- Cohere2Config configuration class: Cohere2Model (Cohere2Config model)
- Cohere2VisionConfig configuration class: Cohere2VisionModel (Cohere2VisionConfig model)
- CohereAsrConfig configuration class: CohereAsrModel (CohereAsrConfig model)
- CohereConfig configuration class: CohereModel (CohereConfig model)
- ConditionalDetrConfig configuration class: ConditionalDetrModel (ConditionalDetrConfig model)
- ConvBertConfig configuration class: ConvBertModel (ConvBertConfig model)
- ConvNextConfig configuration class: ConvNextModel (ConvNextConfig model)
- ConvNextV2Config configuration class: ConvNextV2Model (ConvNextV2Config model)
- CpmAntConfig configuration class: CpmAntModel (CpmAntConfig model)
- CsmConfig configuration class: CsmForConditionalGeneration (CsmConfig model)
- CvtConfig configuration class: CvtModel (CvtConfig model)
- CwmConfig configuration class: CwmModel (CwmConfig model)
- DFineConfig configuration class: DFineModel (DFineConfig model)
- DINOv3ConvNextConfig configuration class: DINOv3ConvNextModel (DINOv3ConvNextConfig model)
- DINOv3ViTConfig configuration class: DINOv3ViTModel (DINOv3ViTConfig model)
- DPRConfig configuration class: DPRQuestionEncoder (DPRConfig model)
- DPTConfig configuration class: DPTModel (DPTConfig model)
- DabDetrConfig configuration class: DabDetrModel (DabDetrConfig model)
- DacConfig configuration class: DacModel (DacConfig model)
- Data2VecAudioConfig configuration class: Data2VecAudioModel (Data2VecAudioConfig model)
- Data2VecTextConfig configuration class: Data2VecTextModel (Data2VecTextConfig model)
- Data2VecVisionConfig configuration class: Data2VecVisionModel (Data2VecVisionConfig model)
- DbrxConfig configuration class: DbrxModel (DbrxConfig model)
- DebertaConfig configuration class: DebertaModel (DebertaConfig model)
- DebertaV2Config configuration class: DebertaV2Model (DebertaV2Config model)
- DecisionTransformerConfig configuration class: DecisionTransformerModel (DecisionTransformerConfig model)
- DeepseekV2Config configuration class: DeepseekV2Model (DeepseekV2Config model)
- DeepseekV3Config configuration class: DeepseekV3Model (DeepseekV3Config model)
- DeepseekVLConfig configuration class: DeepseekVLModel (DeepseekVLConfig model)
- DeepseekVLHybridConfig configuration class: DeepseekVLHybridModel (DeepseekVLHybridConfig model)
- DeformableDetrConfig configuration class: DeformableDetrModel (DeformableDetrConfig model)
- DeiTConfig configuration class: DeiTModel (DeiTConfig model)
- DepthProConfig configuration class: DepthProModel (DepthProConfig model)
- DetrConfig configuration class: DetrModel (DetrConfig model)
- DiaConfig configuration class: DiaModel (DiaConfig model)
- DiffLlamaConfig configuration class: DiffLlamaModel (DiffLlamaConfig model)
- DinatConfig configuration class: DinatModel (DinatConfig model)
- Dinov2Config configuration class: Dinov2Model (Dinov2Config model)
- Dinov2WithRegistersConfig configuration class: Dinov2WithRegistersModel (Dinov2WithRegistersConfig model)
- DistilBertConfig configuration class: DistilBertModel (DistilBertConfig model)
- DogeConfig configuration class: DogeModel (DogeConfig model)
- DonutSwinConfig configuration class: DonutSwinModel (DonutSwinConfig model)
- Dots1Config configuration class: Dots1Model (Dots1Config model)
- EdgeTamConfig configuration class: EdgeTamModel (EdgeTamConfig model)
- EdgeTamVideoConfig configuration class: EdgeTamVideoModel (EdgeTamVideoConfig model)
- EdgeTamVisionConfig configuration class: EdgeTamVisionModel (EdgeTamVisionConfig model)
- EfficientLoFTRConfig configuration class: EfficientLoFTRModel (EfficientLoFTRConfig model)
- EfficientNetConfig configuration class: EfficientNetModel (EfficientNetConfig model)
- ElectraConfig configuration class: ElectraModel (ElectraConfig model)
- Emu3Config configuration class: Emu3Model (Emu3Config model)
- EncodecConfig configuration class: EncodecModel (EncodecConfig model)
- Ernie4_5Config configuration class: Ernie4_5Model (Ernie4_5Config model)
- Ernie4_5_MoeConfig configuration class: Ernie4_5_MoeModel (Ernie4_5_MoeConfig model)
- Ernie4_5_VLMoeConfig configuration class: Ernie4_5_VLMoeModel (Ernie4_5_VLMoeConfig model)
- ErnieConfig configuration class: ErnieModel (ErnieConfig model)
- EsmConfig configuration class: EsmModel (EsmConfig model)
- EuroBertConfig configuration class: EuroBertModel (EuroBertConfig model)
- EvollaConfig configuration class: EvollaModel (EvollaConfig model)
- Exaone4Config configuration class: Exaone4Model (Exaone4Config model)
- ExaoneMoeConfig configuration class: ExaoneMoeModel (ExaoneMoeConfig model)
- FNetConfig configuration class: FNetModel (FNetConfig model)
- FSMTConfig configuration class: FSMTModel (FSMTConfig model)
- FalconConfig configuration class: FalconModel (FalconConfig model)
- FalconH1Config configuration class: FalconH1Model (FalconH1Config model)
- FalconMambaConfig configuration class: FalconMambaModel (FalconMambaConfig model)
- FastSpeech2ConformerConfig configuration class: FastSpeech2ConformerModel (FastSpeech2ConformerConfig model)
- FastSpeech2ConformerWithHifiGanConfig configuration class: FastSpeech2ConformerWithHifiGan (FastSpeech2ConformerWithHifiGanConfig model)
- FastVlmConfig configuration class: FastVlmModel (FastVlmConfig model)
- FlaubertConfig configuration class: FlaubertModel (FlaubertConfig model)
- FlavaConfig configuration class: FlavaModel (FlavaConfig model)
- FlexOlmoConfig configuration class: FlexOlmoModel (FlexOlmoConfig model)
- Florence2Config configuration class: Florence2Model (Florence2Config model)
- FocalNetConfig configuration class: FocalNetModel (FocalNetConfig model)
- FunnelConfig configuration class: FunnelModel or FunnelBaseModel (FunnelConfig model)
- FuyuConfig configuration class: FuyuModel (FuyuConfig model)
- GLPNConfig configuration class: GLPNModel (GLPNConfig model)
- GPT2Config configuration class: GPT2Model (GPT2Config model)
- GPTBigCodeConfig configuration class: GPTBigCodeModel (GPTBigCodeConfig model)
- GPTJConfig configuration class: GPTJModel (GPTJConfig model)
- GPTNeoConfig configuration class: GPTNeoModel (GPTNeoConfig model)
- GPTNeoXConfig configuration class: GPTNeoXModel (GPTNeoXConfig model)
- GPTNeoXJapaneseConfig configuration class: GPTNeoXJapaneseModel (GPTNeoXJapaneseConfig model)
- Gemma2Config configuration class: Gemma2Model (Gemma2Config model)
- Gemma3Config configuration class: Gemma3Model (Gemma3Config model)
- Gemma3TextConfig configuration class: Gemma3TextModel (Gemma3TextConfig model)
- Gemma3nAudioConfig configuration class: Gemma3nAudioEncoder (Gemma3nAudioConfig model)
- Gemma3nConfig configuration class: Gemma3nModel (Gemma3nConfig model)
- Gemma3nTextConfig configuration class: Gemma3nTextModel (Gemma3nTextConfig model)
- Gemma3nVisionConfig configuration class: TimmWrapperModel (Gemma3nVisionConfig model)
- Gemma4AudioConfig configuration class: Gemma4AudioModel (Gemma4AudioConfig model)
- Gemma4Config configuration class: Gemma4Model (Gemma4Config model)
- Gemma4TextConfig configuration class: Gemma4TextModel (Gemma4TextConfig model)
- Gemma4VisionConfig configuration class: Gemma4VisionModel (Gemma4VisionConfig model)
- GemmaConfig configuration class: GemmaModel (GemmaConfig model)
- GitConfig configuration class: GitModel (GitConfig model)
- Glm46VConfig configuration class: Glm46VModel (Glm46VConfig model)
- Glm4Config configuration class: Glm4Model (Glm4Config model)
- Glm4MoeConfig configuration class: Glm4MoeModel (Glm4MoeConfig model)
- Glm4MoeLiteConfig configuration class: Glm4MoeLiteModel (Glm4MoeLiteConfig model)
- Glm4vConfig configuration class: Glm4vModel (Glm4vConfig model)
- Glm4vMoeConfig configuration class: Glm4vMoeModel (Glm4vMoeConfig model)
- Glm4vMoeTextConfig configuration class: Glm4vMoeTextModel (Glm4vMoeTextConfig model)
- Glm4vMoeVisionConfig configuration class: Glm4vMoeVisionModel (Glm4vMoeVisionConfig model)
- Glm4vTextConfig configuration class: Glm4vTextModel (Glm4vTextConfig model)
- Glm4vVisionConfig configuration class: Glm4vVisionModel (Glm4vVisionConfig model)
- GlmAsrConfig configuration class: GlmAsrForConditionalGeneration (GlmAsrConfig model)
- GlmAsrEncoderConfig configuration class: GlmAsrEncoder (GlmAsrEncoderConfig model)
- GlmConfig configuration class: GlmModel (GlmConfig model)
- GlmImageConfig configuration class: GlmImageModel (GlmImageConfig model)
- GlmImageTextConfig configuration class: GlmImageTextModel (GlmImageTextConfig model)
- GlmImageVQVAEConfig configuration class: GlmImageVQVAE (GlmImageVQVAEConfig model)
- GlmImageVisionConfig configuration class: GlmImageVisionModel (GlmImageVisionConfig model)
- GlmMoeDsaConfig configuration class: GlmMoeDsaModel (GlmMoeDsaConfig model)
- GlmOcrConfig configuration class: GlmOcrModel (GlmOcrConfig model)
- GlmOcrTextConfig configuration class: GlmOcrTextModel (GlmOcrTextConfig model)
- GlmOcrVisionConfig configuration class: GlmOcrVisionModel (GlmOcrVisionConfig model)
- GotOcr2Config configuration class: GotOcr2Model (GotOcr2Config model)
- GptOssConfig configuration class: GptOssModel (GptOssConfig model)
- GraniteConfig configuration class: GraniteModel (GraniteConfig model)
- GraniteMoeConfig configuration class: GraniteMoeModel (GraniteMoeConfig model)
- GraniteMoeHybridConfig configuration class: GraniteMoeHybridModel (GraniteMoeHybridConfig model)
- GraniteMoeSharedConfig configuration class: GraniteMoeSharedModel (GraniteMoeSharedConfig model)
- GroundingDinoConfig configuration class: GroundingDinoModel (GroundingDinoConfig model)
- GroupViTConfig configuration class: GroupViTModel (GroupViTConfig model)
- HGNetV2Config configuration class: HGNetV2Backbone (HGNetV2Config model)
- HeliumConfig configuration class: HeliumModel (HeliumConfig model)
- HieraConfig configuration class: HieraModel (HieraConfig model)
- HiggsAudioV2Config configuration class: HiggsAudioV2ForConditionalGeneration (HiggsAudioV2Config model)
- HiggsAudioV2TokenizerConfig configuration class: HiggsAudioV2TokenizerModel (HiggsAudioV2TokenizerConfig model)
- HubertConfig configuration class: HubertModel (HubertConfig model)
- HunYuanDenseV1Config configuration class: HunYuanDenseV1Model (HunYuanDenseV1Config model)
- HunYuanMoEV1Config configuration class: HunYuanMoEV1Model (HunYuanMoEV1Config model)
- IBertConfig configuration class: IBertModel (IBertConfig model)
- IJepaConfig configuration class: IJepaModel (IJepaConfig model)
- Idefics2Config configuration class: Idefics2Model (Idefics2Config model)
- Idefics3Config configuration class: Idefics3Model (Idefics3Config model)
- Idefics3VisionConfig configuration class: Idefics3VisionTransformer (Idefics3VisionConfig model)
- IdeficsConfig configuration class: IdeficsModel (IdeficsConfig model)
- ImageGPTConfig configuration class: ImageGPTModel (ImageGPTConfig model)
- InformerConfig configuration class: InformerModel (InformerConfig model)
- InstructBlipConfig configuration class: InstructBlipModel (InstructBlipConfig model)
- InstructBlipVideoConfig configuration class: InstructBlipVideoModel (InstructBlipVideoConfig model)
- InternVLConfig configuration class: InternVLModel (InternVLConfig model)
- InternVLVisionConfig configuration class: InternVLVisionModel (InternVLVisionConfig model)
- Jais2Config configuration class: Jais2Model (Jais2Config model)
- JambaConfig configuration class: JambaModel (JambaConfig model)
- JanusConfig configuration class: JanusModel (JanusConfig model)
- JetMoeConfig configuration class: JetMoeModel (JetMoeConfig model)
- JinaEmbeddingsV3Config configuration class: JinaEmbeddingsV3Model (JinaEmbeddingsV3Config model)
- Kosmos2Config configuration class: Kosmos2Model (Kosmos2Config model)
- Kosmos2_5Config configuration class: Kosmos2_5Model (Kosmos2_5Config model)
- KyutaiSpeechToTextConfig configuration class: KyutaiSpeechToTextModel (KyutaiSpeechToTextConfig model)
- LEDConfig configuration class: LEDModel (LEDConfig model)
- LasrCTCConfig configuration class: LasrForCTC (LasrCTCConfig model)
- LasrEncoderConfig configuration class: LasrEncoder (LasrEncoderConfig model)
- LayoutLMConfig configuration class: LayoutLMModel (LayoutLMConfig model)
- LayoutLMv2Config configuration class: LayoutLMv2Model (LayoutLMv2Config model)
- LayoutLMv3Config configuration class: LayoutLMv3Model (LayoutLMv3Config model)
- LevitConfig configuration class: LevitModel (LevitConfig model)
- Lfm2Config configuration class: Lfm2Model (Lfm2Config model)
- Lfm2MoeConfig configuration class: Lfm2MoeModel (Lfm2MoeConfig model)
- Lfm2VlConfig configuration class: Lfm2VlModel (Lfm2VlConfig model)
- LightGlueConfig configuration class: LightGlueForKeypointMatching (LightGlueConfig model)
- LightOnOcrConfig configuration class: LightOnOcrModel (LightOnOcrConfig model)
- LiltConfig configuration class: LiltModel (LiltConfig model)
- Llama4Config configuration class: Llama4ForConditionalGeneration (Llama4Config model)
- Llama4TextConfig configuration class: Llama4TextModel (Llama4TextConfig model)
- LlamaConfig configuration class: LlamaModel (LlamaConfig model)
- LlavaConfig configuration class: LlavaModel (LlavaConfig model)
- LlavaNextConfig configuration class: LlavaNextModel (LlavaNextConfig model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoModel (LlavaNextVideoConfig model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionModel (LlavaOnevisionConfig model)
- LongT5Config configuration class: LongT5Model (LongT5Config model)
- LongcatFlashConfig configuration class: LongcatFlashModel (LongcatFlashConfig model)
- LongformerConfig configuration class: LongformerModel (LongformerConfig model)
- LukeConfig configuration class: LukeModel (LukeConfig model)
- LwDetrConfig configuration class: LwDetrModel (LwDetrConfig model)
- LxmertConfig configuration class: LxmertModel (LxmertConfig model)
- M2M100Config configuration class: M2M100Model (M2M100Config model)
- MBartConfig configuration class: MBartModel (MBartConfig model)
- MLCDVisionConfig configuration class: MLCDVisionModel (MLCDVisionConfig model)
- MMGroundingDinoConfig configuration class: MMGroundingDinoModel (MMGroundingDinoConfig model)
- MPNetConfig configuration class: MPNetModel (MPNetConfig model)
- MT5Config configuration class: MT5Model (MT5Config model)
- Mamba2Config configuration class: Mamba2Model (Mamba2Config model)
- MambaConfig configuration class: MambaModel (MambaConfig model)
- MarianConfig configuration class: MarianModel (MarianConfig model)
- MarkupLMConfig configuration class: MarkupLMModel (MarkupLMConfig model)
- Mask2FormerConfig configuration class: Mask2FormerModel (Mask2FormerConfig model)
- MaskFormerConfig configuration class: MaskFormerModel (MaskFormerConfig model)
- MaskFormerSwinConfig configuration class: MaskFormerSwinModel (MaskFormerSwinConfig model)
- MegatronBertConfig configuration class: MegatronBertModel (MegatronBertConfig model)
- MetaClip2Config configuration class: MetaClip2Model (MetaClip2Config model)
- MgpstrConfig configuration class: MgpstrForSceneTextRecognition (MgpstrConfig model)
- MimiConfig configuration class: MimiModel (MimiConfig model)
- MiniMaxConfig configuration class: MiniMaxModel (MiniMaxConfig model)
- MiniMaxM2Config configuration class: MiniMaxM2Model (MiniMaxM2Config model)
- Ministral3Config configuration class: Ministral3Model (Ministral3Config model)
- MinistralConfig configuration class: MinistralModel (MinistralConfig model)
- Mistral3Config configuration class: Mistral3Model (Mistral3Config model)
- Mistral4Config configuration class: Mistral4Model (Mistral4Config model)
- MistralConfig configuration class: MistralModel (MistralConfig model)
- MixtralConfig configuration class: MixtralModel (MixtralConfig model)
- MllamaConfig configuration class: MllamaModel (MllamaConfig model)
- MobileBertConfig configuration class: MobileBertModel (MobileBertConfig model)
- MobileNetV1Config configuration class: MobileNetV1Model (MobileNetV1Config model)
- MobileNetV2Config configuration class: MobileNetV2Model (MobileNetV2Config model)
- MobileViTConfig configuration class: MobileViTModel (MobileViTConfig model)
- MobileViTV2Config configuration class: MobileViTV2Model (MobileViTV2Config model)
- ModernBertConfig configuration class: ModernBertModel (ModernBertConfig model)
- ModernBertDecoderConfig configuration class: ModernBertDecoderModel (ModernBertDecoderConfig model)
- ModernVBertConfig configuration class: ModernVBertModel (ModernVBertConfig model)
- MoonshineConfig configuration class: MoonshineModel (MoonshineConfig model)
- MoonshineStreamingConfig configuration class: MoonshineStreamingModel (MoonshineStreamingConfig model)
- MoshiConfig configuration class: MoshiModel (MoshiConfig model)
- MptConfig configuration class: MptModel (MptConfig model)
- MraConfig configuration class: MraModel (MraConfig model)
- MusicFlamingoConfig configuration class: MusicFlamingoForConditionalGeneration (MusicFlamingoConfig model)
- MusicgenConfig configuration class: MusicgenModel (MusicgenConfig model)
- MusicgenMelodyConfig configuration class: MusicgenMelodyModel (MusicgenMelodyConfig model)
- MvpConfig configuration class: MvpModel (MvpConfig model)
- NanoChatConfig configuration class: NanoChatModel (NanoChatConfig model)
- NemotronConfig configuration class: NemotronModel (NemotronConfig model)
- NemotronHConfig configuration class: NemotronHModel (NemotronHConfig model)
- NllbMoeConfig configuration class: NllbMoeModel (NllbMoeConfig model)
- NomicBertConfig configuration class: NomicBertModel (NomicBertConfig model)
- NystromformerConfig configuration class: NystromformerModel (NystromformerConfig model)
- OPTConfig configuration class: OPTModel (OPTConfig model)
- Olmo2Config configuration class: Olmo2Model (Olmo2Config model)
- Olmo3Config configuration class: Olmo3Model (Olmo3Config model)
- OlmoConfig configuration class: OlmoModel (OlmoConfig model)
- OlmoHybridConfig configuration class: OlmoHybridModel (OlmoHybridConfig model)
- OlmoeConfig configuration class: OlmoeModel (OlmoeConfig model)
- OmDetTurboConfig configuration class: OmDetTurboForObjectDetection (OmDetTurboConfig model)
- OneFormerConfig configuration class: OneFormerModel (OneFormerConfig model)
- OpenAIGPTConfig configuration class: OpenAIGPTModel (OpenAIGPTConfig model)
- Ovis2Config configuration class: Ovis2Model (Ovis2Config model)
- OwlViTConfig configuration class: OwlViTModel (OwlViTConfig model)
- Owlv2Config configuration class: Owlv2Model (Owlv2Config model)
- PI0Config configuration class: PI0Model (PI0Config model)
- PLBartConfig configuration class: PLBartModel (PLBartConfig model)
- PPDocLayoutV3Config configuration class: PPDocLayoutV3Model (PPDocLayoutV3Config model)
- PPOCRV5MobileRecConfig configuration class: PPOCRV5MobileRecModel (PPOCRV5MobileRecConfig model)
- PPOCRV5ServerRecConfig configuration class: PPOCRV5ServerRecModel (PPOCRV5ServerRecConfig model)
- PaliGemmaConfig configuration class: PaliGemmaModel (PaliGemmaConfig model)
- ParakeetCTCConfig configuration class: ParakeetForCTC (ParakeetCTCConfig model)
- ParakeetEncoderConfig configuration class: ParakeetEncoder (ParakeetEncoderConfig model)
- PatchTSMixerConfig configuration class: PatchTSMixerModel (PatchTSMixerConfig model)
- PatchTSTConfig configuration class: PatchTSTModel (PatchTSTConfig model)
- PeAudioConfig configuration class: PeAudioModel (PeAudioConfig model)
- PeAudioEncoderConfig configuration class: PeAudioEncoder (PeAudioEncoderConfig model)
- PeAudioVideoConfig configuration class: PeAudioVideoModel (PeAudioVideoConfig model)
- PeAudioVideoEncoderConfig configuration class: PeAudioVideoEncoder (PeAudioVideoEncoderConfig model)
- PeVideoConfig configuration class: PeVideoModel (PeVideoConfig model)
- PeVideoEncoderConfig configuration class: PeVideoEncoder (PeVideoEncoderConfig model)
- PegasusConfig configuration class: PegasusModel (PegasusConfig model)
- PegasusXConfig configuration class: PegasusXModel (PegasusXConfig model)
- PerceiverConfig configuration class: PerceiverModel (PerceiverConfig model)
- PerceptionLMConfig configuration class: PerceptionLMModel (PerceptionLMConfig model)
- PersimmonConfig configuration class: PersimmonModel (PersimmonConfig model)
- Phi3Config configuration class: Phi3Model (Phi3Config model)
- Phi4MultimodalConfig configuration class: Phi4MultimodalModel (Phi4MultimodalConfig model)
- PhiConfig configuration class: PhiModel (PhiConfig model)
- PhimoeConfig configuration class: PhimoeModel (PhimoeConfig model)
- PixioConfig configuration class: PixioModel (PixioConfig model)
- PixtralVisionConfig configuration class: PixtralVisionModel (PixtralVisionConfig model)
- PoolFormerConfig configuration class: PoolFormerModel (PoolFormerConfig model)
- ProphetNetConfig configuration class: ProphetNetModel (ProphetNetConfig model)
- PvtConfig configuration class: PvtModel (PvtConfig model)
- PvtV2Config configuration class: PvtV2Model (PvtV2Config model)
- Qwen2AudioEncoderConfig configuration class: Qwen2AudioEncoder (Qwen2AudioEncoderConfig model)
- Qwen2Config configuration class: Qwen2Model (Qwen2Config model)
- Qwen2MoeConfig configuration class: Qwen2MoeModel (Qwen2MoeConfig model)
- Qwen2VLConfig configuration class: Qwen2VLModel (Qwen2VLConfig model)
- Qwen2VLTextConfig configuration class: Qwen2VLTextModel (Qwen2VLTextConfig model)
- Qwen2_5_VLConfig configuration class: Qwen2_5_VLModel (Qwen2_5_VLConfig model)
- Qwen2_5_VLTextConfig configuration class: Qwen2_5_VLTextModel (Qwen2_5_VLTextConfig model)
- Qwen3Config configuration class: Qwen3Model (Qwen3Config model)
- Qwen3MoeConfig configuration class: Qwen3MoeModel (Qwen3MoeConfig model)
- Qwen3NextConfig configuration class: Qwen3NextModel (Qwen3NextConfig model)
- Qwen3VLConfig configuration class: Qwen3VLModel (Qwen3VLConfig model)
- Qwen3VLMoeConfig configuration class: Qwen3VLMoeModel (Qwen3VLMoeConfig model)
- Qwen3VLMoeTextConfig configuration class: Qwen3VLMoeTextModel (Qwen3VLMoeTextConfig model)
- Qwen3VLTextConfig configuration class: Qwen3VLTextModel (Qwen3VLTextConfig model)
- Qwen3_5Config configuration class: Qwen3_5Model (Qwen3_5Config model)
- Qwen3_5MoeConfig configuration class: Qwen3_5MoeModel (Qwen3_5MoeConfig model)
- Qwen3_5MoeTextConfig configuration class: Qwen3_5MoeTextModel (Qwen3_5MoeTextConfig model)
- Qwen3_5TextConfig configuration class: Qwen3_5TextModel (Qwen3_5TextConfig model)
- RTDetrConfig configuration class: RTDetrModel (RTDetrConfig model)
- RTDetrV2Config configuration class: RTDetrV2Model (RTDetrV2Config model)
- RecurrentGemmaConfig configuration class: RecurrentGemmaModel (RecurrentGemmaConfig model)
- ReformerConfig configuration class: ReformerModel (ReformerConfig model)
- RegNetConfig configuration class: RegNetModel (RegNetConfig model)
- RemBertConfig configuration class: RemBertModel (RemBertConfig model)
- ResNetConfig configuration class: ResNetModel (ResNetConfig model)
- RoCBertConfig configuration class: RoCBertModel (RoCBertConfig model)
- RoFormerConfig configuration class: RoFormerModel (RoFormerConfig model)
- RobertaConfig configuration class: RobertaModel (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormModel (RobertaPreLayerNormConfig model)
- RwkvConfig configuration class: RwkvModel (RwkvConfig model)
- SEWConfig configuration class: SEWModel (SEWConfig model)
- SEWDConfig configuration class: SEWDModel (SEWDConfig model)
- Sam2Config configuration class: Sam2Model (Sam2Config model)
- Sam2HieraDetConfig configuration class: Sam2HieraDetModel (Sam2HieraDetConfig model)
- Sam2VideoConfig configuration class: Sam2VideoModel (Sam2VideoConfig model)
- Sam2VisionConfig configuration class: Sam2VisionModel (Sam2VisionConfig model)
- Sam3Config configuration class: Sam3Model (Sam3Config model)
- Sam3LiteTextConfig configuration class: Sam3LiteTextModel (Sam3LiteTextConfig model)
- Sam3LiteTextTextConfig configuration class: Sam3LiteTextTextModel (Sam3LiteTextTextConfig model)
- Sam3TrackerConfig configuration class: Sam3TrackerModel (Sam3TrackerConfig model)
- Sam3TrackerVideoConfig configuration class: Sam3TrackerVideoModel (Sam3TrackerVideoConfig model)
- Sam3ViTConfig configuration class: Sam3ViTModel (Sam3ViTConfig model)
- Sam3VideoConfig configuration class: Sam3VideoModel (Sam3VideoConfig model)
- Sam3VisionConfig configuration class: Sam3VisionModel (Sam3VisionConfig model)
- SamConfig configuration class: SamModel (SamConfig model)
- SamHQConfig configuration class: SamHQModel (SamHQConfig model)
- SamHQVisionConfig configuration class: SamHQVisionModel (SamHQVisionConfig model)
- SamVisionConfig configuration class: SamVisionModel (SamVisionConfig model)
- SeamlessM4TConfig configuration class: SeamlessM4TModel (SeamlessM4TConfig model)
- SeamlessM4Tv2Config configuration class: SeamlessM4Tv2Model (SeamlessM4Tv2Config model)
- SeedOssConfig configuration class: SeedOssModel (SeedOssConfig model)
- SegGptConfig configuration class: SegGptModel (SegGptConfig model)
- SegformerConfig configuration class: SegformerModel (SegformerConfig model)
- Siglip2Config configuration class: Siglip2Model (Siglip2Config model)
- Siglip2VisionConfig configuration class: Siglip2VisionModel (Siglip2VisionConfig model)
- SiglipConfig configuration class: SiglipModel (SiglipConfig model)
- SiglipVisionConfig configuration class: SiglipVisionModel (SiglipVisionConfig model)
- SmolLM3Config configuration class: SmolLM3Model (SmolLM3Config model)
- SmolVLMConfig configuration class: SmolVLMModel (SmolVLMConfig model)
- SmolVLMVisionConfig configuration class: SmolVLMVisionTransformer (SmolVLMVisionConfig model)
- SolarOpenConfig configuration class: SolarOpenModel (SolarOpenConfig model)
- Speech2TextConfig configuration class: Speech2TextModel (Speech2TextConfig model)
- SpeechT5Config configuration class: SpeechT5Model (SpeechT5Config model)
- SplinterConfig configuration class: SplinterModel (SplinterConfig model)
- SqueezeBertConfig configuration class: SqueezeBertModel (SqueezeBertConfig model)
- StableLmConfig configuration class: StableLmModel (StableLmConfig model)
- Starcoder2Config configuration class: Starcoder2Model (Starcoder2Config model)
- SwiftFormerConfig configuration class: SwiftFormerModel (SwiftFormerConfig model)
- Swin2SRConfig configuration class: Swin2SRModel (Swin2SRConfig model)
- SwinConfig configuration class: SwinModel (SwinConfig model)
- Swinv2Config configuration class: Swinv2Model (Swinv2Config model)
- SwitchTransformersConfig configuration class: SwitchTransformersModel (SwitchTransformersConfig model)
- T5Config configuration class: T5Model (T5Config model)
- T5Gemma2Config configuration class: T5Gemma2Model (T5Gemma2Config model)
- T5Gemma2EncoderConfig configuration class: T5Gemma2Encoder (T5Gemma2EncoderConfig model)
- T5GemmaConfig configuration class: T5GemmaModel (T5GemmaConfig model)
- TableTransformerConfig configuration class: TableTransformerModel (TableTransformerConfig model)
- TapasConfig configuration class: TapasModel (TapasConfig model)
- TextNetConfig configuration class: TextNetModel (TextNetConfig model)
- TimeSeriesTransformerConfig configuration class: TimeSeriesTransformerModel (TimeSeriesTransformerConfig model)
- TimesFm2_5Config configuration class: TimesFm2_5Model (TimesFm2_5Config model)
- TimesFmConfig configuration class: TimesFmModel (TimesFmConfig model)
- TimesformerConfig configuration class: TimesformerModel (TimesformerConfig model)
- TimmBackboneConfig configuration class: TimmBackbone (TimmBackboneConfig model)
- TimmWrapperConfig configuration class: TimmWrapperModel (TimmWrapperConfig model)
- TvpConfig configuration class: TvpModel (TvpConfig model)
- UMT5Config configuration class: UMT5Model (UMT5Config model)
- UVDocConfig configuration class: UVDocModel (UVDocConfig model)
- UdopConfig configuration class: UdopModel (UdopConfig model)
- UniSpeechConfig configuration class: UniSpeechModel (UniSpeechConfig model)
- UniSpeechSatConfig configuration class: UniSpeechSatModel (UniSpeechSatConfig model)
- UnivNetConfig configuration class: UnivNetModel (UnivNetConfig model)
- VJEPA2Config configuration class: VJEPA2Model (VJEPA2Config model)
- VaultGemmaConfig configuration class: VaultGemmaModel (VaultGemmaConfig model)
- ViTConfig configuration class: ViTModel (ViTConfig model)
- ViTMAEConfig configuration class: ViTMAEModel (ViTMAEConfig model)
- ViTMSNConfig configuration class: ViTMSNModel (ViTMSNConfig model)
- VibeVoiceAcousticTokenizerConfig configuration class: VibeVoiceAcousticTokenizerModel (VibeVoiceAcousticTokenizerConfig model)
- VibeVoiceAcousticTokenizerDecoderConfig configuration class: VibeVoiceAcousticTokenizerDecoderModel (VibeVoiceAcousticTokenizerDecoderConfig model)
- VibeVoiceAcousticTokenizerEncoderConfig configuration class: VibeVoiceAcousticTokenizerEncoderModel (VibeVoiceAcousticTokenizerEncoderConfig model)
- VibeVoiceAsrConfig configuration class: VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- VideoLlama3Config configuration class: VideoLlama3Model (VideoLlama3Config model)
- VideoLlama3VisionConfig configuration class: VideoLlama3VisionModel (VideoLlama3VisionConfig model)
- VideoLlavaConfig configuration class: VideoLlavaModel (VideoLlavaConfig model)
- VideoMAEConfig configuration class: VideoMAEModel (VideoMAEConfig model)
- ViltConfig configuration class: ViltModel (ViltConfig model)
- VipLlavaConfig configuration class: VipLlavaModel (VipLlavaConfig model)
- VisionTextDualEncoderConfig configuration class: VisionTextDualEncoderModel (VisionTextDualEncoderConfig model)
- VisualBertConfig configuration class: VisualBertModel (VisualBertConfig model)
- VitDetConfig configuration class: VitDetModel (VitDetConfig model)
- VitsConfig configuration class: VitsModel (VitsConfig model)
- VivitConfig configuration class: VivitModel (VivitConfig model)
- VoxtralConfig configuration class: VoxtralForConditionalGeneration (VoxtralConfig model)
- VoxtralEncoderConfig configuration class: VoxtralEncoder (VoxtralEncoderConfig model)
- VoxtralRealtimeConfig configuration class: VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
- VoxtralRealtimeEncoderConfig configuration class: VoxtralRealtimeEncoder (VoxtralRealtimeEncoderConfig model)
- VoxtralRealtimeTextConfig configuration class: VoxtralRealtimeTextModel (VoxtralRealtimeTextConfig model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertModel (Wav2Vec2BertConfig model)
- Wav2Vec2Config configuration class: Wav2Vec2Model (Wav2Vec2Config model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerModel (Wav2Vec2ConformerConfig model)
- WavLMConfig configuration class: WavLMModel (WavLMConfig model)
- WhisperConfig configuration class: WhisperModel (WhisperConfig model)
- XCLIPConfig configuration class: XCLIPModel (XCLIPConfig model)
- XGLMConfig configuration class: XGLMModel (XGLMConfig model)
- XLMConfig configuration class: XLMModel (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaModel (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLModel (XLMRobertaXLConfig model)
- XLNetConfig configuration class: XLNetModel (XLNetConfig model)
- XcodecConfig configuration class: XcodecModel (XcodecConfig model)
- XmodConfig configuration class: XmodModel (XmodConfig model)
- YolosConfig configuration class: YolosModel (YolosConfig model)
- YosoConfig configuration class: YosoModel (YosoConfig model)
- YoutuConfig configuration class: YoutuModel (YoutuConfig model)
- Zamba2Config configuration class: Zamba2Model (Zamba2Config model)
- ZambaConfig configuration class: ZambaModel (ZambaConfig model)
- xLSTMConfig configuration class: xLSTMModel (xLSTMConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.
Instantiates one of the base model classes of the library from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
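As an offline sketch of these parameters (assuming transformers and torch are installed; the tiny config sizes and the temporary directory are stand-ins for a real Hub repo), the round trip below saves a model with save_pretrained() and reloads it through AutoModel.from_pretrained(), including a kwargs override of a configuration attribute:

```python
import tempfile

from transformers import AutoModel, BertConfig, BertModel

# Save a tiny, randomly initialized BERT checkpoint to a local directory,
# producing the same config.json + weights layout as any real checkpoint.
config = BertConfig(hidden_size=32, num_hidden_layers=1, num_attention_heads=2, intermediate_size=64)
with tempfile.TemporaryDirectory() as save_dir:
    BertModel(config).save_pretrained(save_dir)

    # AutoModel reads config.json, sees model_type == "bert", and returns a BertModel.
    # output_attentions matches a configuration attribute, so it overrides the config.
    model = AutoModel.from_pretrained(save_dir, output_attentions=True)

print(type(model).__name__, model.config.output_attentions)  # BertModel True
```

Passing a model id string such as "google-bert/bert-base-cased" instead of the local directory works the same way, except that the config and weights are first downloaded from the Hub (and cached under cache_dir).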
Instantiate one of the base model classes of the library from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- afmoe — AfmoeModel (AfmoeConfig model)
- aimv2 — Aimv2Model (Aimv2Config model)
- aimv2_vision_model — Aimv2VisionModel (Aimv2VisionConfig model)
- albert — AlbertModel (AlbertConfig model)
- align — AlignModel (AlignConfig model)
- altclip — AltCLIPModel (AltCLIPConfig model)
- apertus — ApertusModel (ApertusConfig model)
- arcee — ArceeModel (ArceeConfig model)
- aria — AriaModel (AriaConfig model)
- aria_text — AriaTextModel (AriaTextConfig model)
- audio-spectrogram-transformer — ASTModel (ASTConfig model)
- audioflamingo3 — AudioFlamingo3ForConditionalGeneration (AudioFlamingo3Config model)
- audioflamingo3_encoder — AudioFlamingo3Encoder (AudioFlamingo3EncoderConfig model)
- autoformer — AutoformerModel (AutoformerConfig model)
- aya_vision — AyaVisionModel (AyaVisionConfig model)
- bamba — BambaModel (BambaConfig model)
- bark — BarkModel (BarkConfig model)
- bart — BartModel (BartConfig model)
- beit — BeitModel (BeitConfig model)
- bert — BertModel (BertConfig model)
- bert-generation — BertGenerationEncoder (BertGenerationConfig model)
- big_bird — BigBirdModel (BigBirdConfig model)
- bigbird_pegasus — BigBirdPegasusModel (BigBirdPegasusConfig model)
- biogpt — BioGptModel (BioGptConfig model)
- bit — BitModel (BitConfig model)
- bitnet — BitNetModel (BitNetConfig model)
- blenderbot — BlenderbotModel (BlenderbotConfig model)
- blenderbot-small — BlenderbotSmallModel (BlenderbotSmallConfig model)
- blip — BlipModel (BlipConfig model)
- blip-2 — Blip2Model (Blip2Config model)
- blip_2_qformer — Blip2QFormerModel (Blip2QFormerConfig model)
- bloom — BloomModel (BloomConfig model)
- blt — BltModel (BltConfig model)
- bridgetower — BridgeTowerModel (BridgeTowerConfig model)
- bros — BrosModel (BrosConfig model)
- camembert — CamembertModel (CamembertConfig model)
- canine — CanineModel (CanineConfig model)
- chameleon — ChameleonModel (ChameleonConfig model)
- chinese_clip — ChineseCLIPModel (ChineseCLIPConfig model)
- chinese_clip_vision_model — ChineseCLIPVisionModel (ChineseCLIPVisionConfig model)
- clap — ClapModel (ClapConfig model)
- clip — CLIPModel (CLIPConfig model)
- clip_text_model — CLIPTextModel (CLIPTextConfig model)
- clip_vision_model — CLIPVisionModel (CLIPVisionConfig model)
- clipseg — CLIPSegModel (CLIPSegConfig model)
- clvp — ClvpModelForConditionalGeneration (ClvpConfig model)
- codegen — CodeGenModel (CodeGenConfig model)
- cohere — CohereModel (CohereConfig model)
- cohere2 — Cohere2Model (Cohere2Config model)
- cohere2_vision — Cohere2VisionModel (Cohere2VisionConfig model)
- cohere_asr — CohereAsrModel (CohereAsrConfig model)
- conditional_detr — ConditionalDetrModel (ConditionalDetrConfig model)
- convbert — ConvBertModel (ConvBertConfig model)
- convnext — ConvNextModel (ConvNextConfig model)
- convnextv2 — ConvNextV2Model (ConvNextV2Config model)
- cpmant — CpmAntModel (CpmAntConfig model)
- csm — CsmForConditionalGeneration (CsmConfig model)
- ctrl — CTRLModel (CTRLConfig model)
- cvt — CvtModel (CvtConfig model)
- cwm — CwmModel (CwmConfig model)
- d_fine — DFineModel (DFineConfig model)
- dab-detr — DabDetrModel (DabDetrConfig model)
- dac — DacModel (DacConfig model)
- data2vec-audio — Data2VecAudioModel (Data2VecAudioConfig model)
- data2vec-text — Data2VecTextModel (Data2VecTextConfig model)
- data2vec-vision — Data2VecVisionModel (Data2VecVisionConfig model)
- dbrx — DbrxModel (DbrxConfig model)
- deberta — DebertaModel (DebertaConfig model)
- deberta-v2 — DebertaV2Model (DebertaV2Config model)
- decision_transformer — DecisionTransformerModel (DecisionTransformerConfig model)
- deepseek_v2 — DeepseekV2Model (DeepseekV2Config model)
- deepseek_v3 — DeepseekV3Model (DeepseekV3Config model)
- deepseek_vl — DeepseekVLModel (DeepseekVLConfig model)
- deepseek_vl_hybrid — DeepseekVLHybridModel (DeepseekVLHybridConfig model)
- deformable_detr — DeformableDetrModel (DeformableDetrConfig model)
- deit — DeiTModel (DeiTConfig model)
- depth_pro — DepthProModel (DepthProConfig model)
- detr — DetrModel (DetrConfig model)
- dia — DiaModel (DiaConfig model)
- diffllama — DiffLlamaModel (DiffLlamaConfig model)
- dinat — DinatModel (DinatConfig model)
- dinov2 — Dinov2Model (Dinov2Config model)
- dinov2_with_registers — Dinov2WithRegistersModel (Dinov2WithRegistersConfig model)
- dinov3_convnext — DINOv3ConvNextModel (DINOv3ConvNextConfig model)
- dinov3_vit — DINOv3ViTModel (DINOv3ViTConfig model)
- distilbert — DistilBertModel (DistilBertConfig model)
- doge — DogeModel (DogeConfig model)
- donut-swin — DonutSwinModel (DonutSwinConfig model)
- dots1 — Dots1Model (Dots1Config model)
- dpr — DPRQuestionEncoder (DPRConfig model)
- dpt — DPTModel (DPTConfig model)
- edgetam — EdgeTamModel (EdgeTamConfig model)
- edgetam_video — EdgeTamVideoModel (EdgeTamVideoConfig model)
- edgetam_vision_model — EdgeTamVisionModel (EdgeTamVisionConfig model)
- efficientloftr — EfficientLoFTRModel (EfficientLoFTRConfig model)
- efficientnet — EfficientNetModel (EfficientNetConfig model)
- electra — ElectraModel (ElectraConfig model)
- emu3 — Emu3Model (Emu3Config model)
- encodec — EncodecModel (EncodecConfig model)
- ernie — ErnieModel (ErnieConfig model)
- ernie4_5 — Ernie4_5Model (Ernie4_5Config model)
- ernie4_5_moe — Ernie4_5_MoeModel (Ernie4_5_MoeConfig model)
- ernie4_5_vl_moe — Ernie4_5_VLMoeModel (Ernie4_5_VLMoeConfig model)
- esm — EsmModel (EsmConfig model)
- eurobert — EuroBertModel (EuroBertConfig model)
- evolla — EvollaModel (EvollaConfig model)
- exaone4 — Exaone4Model (Exaone4Config model)
- exaone_moe — ExaoneMoeModel (ExaoneMoeConfig model)
- falcon — FalconModel (FalconConfig model)
- falcon_h1 — FalconH1Model (FalconH1Config model)
- falcon_mamba — FalconMambaModel (FalconMambaConfig model)
- fast_vlm — FastVlmModel (FastVlmConfig model)
- fastspeech2_conformer — FastSpeech2ConformerModel (FastSpeech2ConformerConfig model)
- fastspeech2_conformer_with_hifigan — FastSpeech2ConformerWithHifiGan (FastSpeech2ConformerWithHifiGanConfig model)
- flaubert — FlaubertModel (FlaubertConfig model)
- flava — FlavaModel (FlavaConfig model)
- flex_olmo — FlexOlmoModel (FlexOlmoConfig model)
- florence2 — Florence2Model (Florence2Config model)
- fnet — FNetModel (FNetConfig model)
- focalnet — FocalNetModel (FocalNetConfig model)
- fsmt — FSMTModel (FSMTConfig model)
- funnel — FunnelModel or FunnelBaseModel (FunnelConfig model)
- fuyu — FuyuModel (FuyuConfig model)
- gemma — GemmaModel (GemmaConfig model)
- gemma2 — Gemma2Model (Gemma2Config model)
- gemma3 — Gemma3Model (Gemma3Config model)
- gemma3_text — Gemma3TextModel (Gemma3TextConfig model)
- gemma3n — Gemma3nModel (Gemma3nConfig model)
- gemma3n_audio — Gemma3nAudioEncoder (Gemma3nAudioConfig model)
- gemma3n_text — Gemma3nTextModel (Gemma3nTextConfig model)
- gemma3n_vision — TimmWrapperModel (Gemma3nVisionConfig model)
- gemma4 — Gemma4Model (Gemma4Config model)
- gemma4_audio — Gemma4AudioModel (Gemma4AudioConfig model)
- gemma4_text — Gemma4TextModel (Gemma4TextConfig model)
- gemma4_vision — Gemma4VisionModel (Gemma4VisionConfig model)
- git — GitModel (GitConfig model)
- glm — GlmModel (GlmConfig model)
- glm4 — Glm4Model (Glm4Config model)
- glm46v — Glm46VModel (Glm46VConfig model)
- glm4_moe — Glm4MoeModel (Glm4MoeConfig model)
- glm4_moe_lite — Glm4MoeLiteModel (Glm4MoeLiteConfig model)
- glm4v — Glm4vModel (Glm4vConfig model)
- glm4v_moe — Glm4vMoeModel (Glm4vMoeConfig model)
- glm4v_moe_text — Glm4vMoeTextModel (Glm4vMoeTextConfig model)
- glm4v_moe_vision — Glm4vMoeVisionModel (Glm4vMoeVisionConfig model)
- glm4v_text — Glm4vTextModel (Glm4vTextConfig model)
- glm4v_vision — Glm4vVisionModel (Glm4vVisionConfig model)
- glm_image — GlmImageModel (GlmImageConfig model)
- glm_image_text — GlmImageTextModel (GlmImageTextConfig model)
- glm_image_vision — GlmImageVisionModel (GlmImageVisionConfig model)
- glm_image_vqmodel — GlmImageVQVAE (GlmImageVQVAEConfig model)
- glm_moe_dsa — GlmMoeDsaModel (GlmMoeDsaConfig model)
- glm_ocr — GlmOcrModel (GlmOcrConfig model)
- glm_ocr_text — GlmOcrTextModel (GlmOcrTextConfig model)
- glm_ocr_vision — GlmOcrVisionModel (GlmOcrVisionConfig model)
- glmasr — GlmAsrForConditionalGeneration (GlmAsrConfig model)
- glmasr_encoder — GlmAsrEncoder (GlmAsrEncoderConfig model)
- glpn — GLPNModel (GLPNConfig model)
- got_ocr2 — GotOcr2Model (GotOcr2Config model)
- gpt-sw3 — GPT2Model (GPT2Config model)
- gpt2 — GPT2Model (GPT2Config model)
- gpt_bigcode — GPTBigCodeModel (GPTBigCodeConfig model)
- gpt_neo — GPTNeoModel (GPTNeoConfig model)
- gpt_neox — GPTNeoXModel (GPTNeoXConfig model)
- gpt_neox_japanese — GPTNeoXJapaneseModel (GPTNeoXJapaneseConfig model)
- gpt_oss — GptOssModel (GptOssConfig model)
- gptj — GPTJModel (GPTJConfig model)
- granite — GraniteModel (GraniteConfig model)
- granitemoe — GraniteMoeModel (GraniteMoeConfig model)
- granitemoehybrid — GraniteMoeHybridModel (GraniteMoeHybridConfig model)
- granitemoeshared — GraniteMoeSharedModel (GraniteMoeSharedConfig model)
- grounding-dino — GroundingDinoModel (GroundingDinoConfig model)
- groupvit — GroupViTModel (GroupViTConfig model)
- helium — HeliumModel (HeliumConfig model)
- hgnet_v2 — HGNetV2Backbone (HGNetV2Config model)
- hiera — HieraModel (HieraConfig model)
- higgs_audio_v2 — HiggsAudioV2ForConditionalGeneration (HiggsAudioV2Config model)
- higgs_audio_v2_tokenizer — HiggsAudioV2TokenizerModel (HiggsAudioV2TokenizerConfig model)
- hubert — HubertModel (HubertConfig model)
- hunyuan_v1_dense — HunYuanDenseV1Model (HunYuanDenseV1Config model)
- hunyuan_v1_moe — HunYuanMoEV1Model (HunYuanMoEV1Config model)
- ibert — IBertModel (IBertConfig model)
- idefics — IdeficsModel (IdeficsConfig model)
- idefics2 — Idefics2Model (Idefics2Config model)
- idefics3 — Idefics3Model (Idefics3Config model)
- idefics3_vision — Idefics3VisionTransformer (Idefics3VisionConfig model)
- ijepa — IJepaModel (IJepaConfig model)
- imagegpt — ImageGPTModel (ImageGPTConfig model)
- informer — InformerModel (InformerConfig model)
- instructblip — InstructBlipModel (InstructBlipConfig model)
- instructblipvideo — InstructBlipVideoModel (InstructBlipVideoConfig model)
- internvl — InternVLModel (InternVLConfig model)
- internvl_vision — InternVLVisionModel (InternVLVisionConfig model)
- jais2 — Jais2Model (Jais2Config model)
- jamba — JambaModel (JambaConfig model)
- janus — JanusModel (JanusConfig model)
- jetmoe — JetMoeModel (JetMoeConfig model)
- jina_embeddings_v3 — JinaEmbeddingsV3Model (JinaEmbeddingsV3Config model)
- kosmos-2 — Kosmos2Model (Kosmos2Config model)
- kosmos-2.5 — Kosmos2_5Model (Kosmos2_5Config model)
- kyutai_speech_to_text — KyutaiSpeechToTextModel (KyutaiSpeechToTextConfig model)
- lasr_ctc — LasrForCTC (LasrCTCConfig model)
- lasr_encoder — LasrEncoder (LasrEncoderConfig model)
- layoutlm — LayoutLMModel (LayoutLMConfig model)
- layoutlmv2 — LayoutLMv2Model (LayoutLMv2Config model)
- layoutlmv3 — LayoutLMv3Model (LayoutLMv3Config model)
- led — LEDModel (LEDConfig model)
- levit — LevitModel (LevitConfig model)
- lfm2 — Lfm2Model (Lfm2Config model)
- lfm2_moe — Lfm2MoeModel (Lfm2MoeConfig model)
- lfm2_vl — Lfm2VlModel (Lfm2VlConfig model)
- lightglue — LightGlueForKeypointMatching (LightGlueConfig model)
- lighton_ocr — LightOnOcrModel (LightOnOcrConfig model)
- lilt — LiltModel (LiltConfig model)
- llama — LlamaModel (LlamaConfig model)
- llama4 — Llama4ForConditionalGeneration (Llama4Config model)
- llama4_text — Llama4TextModel (Llama4TextConfig model)
- llava — LlavaModel (LlavaConfig model)
- llava_next — LlavaNextModel (LlavaNextConfig model)
- llava_next_video — LlavaNextVideoModel (LlavaNextVideoConfig model)
- llava_onevision — LlavaOnevisionModel (LlavaOnevisionConfig model)
- longcat_flash — LongcatFlashModel (LongcatFlashConfig model)
- longformer — LongformerModel (LongformerConfig model)
- longt5 — LongT5Model (LongT5Config model)
- luke — LukeModel (LukeConfig model)
- lw_detr — LwDetrModel (LwDetrConfig model)
- lxmert — LxmertModel (LxmertConfig model)
- m2m_100 — M2M100Model (M2M100Config model)
- mamba — MambaModel (MambaConfig model)
- mamba2 — Mamba2Model (Mamba2Config model)
- marian — MarianModel (MarianConfig model)
- markuplm — MarkupLMModel (MarkupLMConfig model)
- mask2former — Mask2FormerModel (Mask2FormerConfig model)
- maskformer — MaskFormerModel (MaskFormerConfig model)
- maskformer-swin — MaskFormerSwinModel (MaskFormerSwinConfig model)
- mbart — MBartModel (MBartConfig model)
- megatron-bert — MegatronBertModel (MegatronBertConfig model)
- metaclip_2 — MetaClip2Model (MetaClip2Config model)
- mgp-str — MgpstrForSceneTextRecognition (MgpstrConfig model)
- mimi — MimiModel (MimiConfig model)
- minimax — MiniMaxModel (MiniMaxConfig model)
- minimax_m2 — MiniMaxM2Model (MiniMaxM2Config model)
- ministral — MinistralModel (MinistralConfig model)
- ministral3 — Ministral3Model (Ministral3Config model)
- mistral — MistralModel (MistralConfig model)
- mistral3 — Mistral3Model (Mistral3Config model)
- mistral4 — Mistral4Model (Mistral4Config model)
- mixtral — MixtralModel (MixtralConfig model)
- mlcd — MLCDVisionModel (MLCDVisionConfig model)
- mlcd_vision_model — MLCDVisionModel (MLCDVisionConfig model)
- mllama — MllamaModel (MllamaConfig model)
- mm-grounding-dino — MMGroundingDinoModel (MMGroundingDinoConfig model)
- mobilebert — MobileBertModel (MobileBertConfig model)
- mobilenet_v1 — MobileNetV1Model (MobileNetV1Config model)
- mobilenet_v2 — MobileNetV2Model (MobileNetV2Config model)
- mobilevit — MobileViTModel (MobileViTConfig model)
- mobilevitv2 — MobileViTV2Model (MobileViTV2Config model)
- modernbert — ModernBertModel (ModernBertConfig model)
- modernbert-decoder — ModernBertDecoderModel (ModernBertDecoderConfig model)
- modernvbert — ModernVBertModel (ModernVBertConfig model)
- moonshine — MoonshineModel (MoonshineConfig model)
- moonshine_streaming — MoonshineStreamingModel (MoonshineStreamingConfig model)
- moshi — MoshiModel (MoshiConfig model)
- mpnet — MPNetModel (MPNetConfig model)
- mpt — MptModel (MptConfig model)
- mra — MraModel (MraConfig model)
- mt5 — MT5Model (MT5Config model)
- musicflamingo — MusicFlamingoForConditionalGeneration (MusicFlamingoConfig model)
- musicgen — MusicgenModel (MusicgenConfig model)
- musicgen_melody — MusicgenMelodyModel (MusicgenMelodyConfig model)
- mvp — MvpModel (MvpConfig model)
- nanochat — NanoChatModel (NanoChatConfig model)
- nemotron — NemotronModel (NemotronConfig model)
- nemotron_h — NemotronHModel (NemotronHConfig model)
- nllb-moe — NllbMoeModel (NllbMoeConfig model)
- nomic_bert — NomicBertModel (NomicBertConfig model)
- nystromformer — NystromformerModel (NystromformerConfig model)
- olmo — OlmoModel (OlmoConfig model)
- olmo2 — Olmo2Model (Olmo2Config model)
- olmo3 — Olmo3Model (Olmo3Config model)
- olmo_hybrid — OlmoHybridModel (OlmoHybridConfig model)
- olmoe — OlmoeModel (OlmoeConfig model)
- omdet-turbo — OmDetTurboForObjectDetection (OmDetTurboConfig model)
- oneformer — OneFormerModel (OneFormerConfig model)
- openai-gpt — OpenAIGPTModel (OpenAIGPTConfig model)
- opt — OPTModel (OPTConfig model)
- ovis2 — Ovis2Model (Ovis2Config model)
- owlv2 — Owlv2Model (Owlv2Config model)
- owlvit — OwlViTModel (OwlViTConfig model)
- paligemma — PaliGemmaModel (PaliGemmaConfig model)
- parakeet_ctc — ParakeetForCTC (ParakeetCTCConfig model)
- parakeet_encoder — ParakeetEncoder (ParakeetEncoderConfig model)
- patchtsmixer — PatchTSMixerModel (PatchTSMixerConfig model)
- patchtst — PatchTSTModel (PatchTSTConfig model)
- pe_audio — PeAudioModel (PeAudioConfig model)
- pe_audio_encoder — PeAudioEncoder (PeAudioEncoderConfig model)
- pe_audio_video — PeAudioVideoModel (PeAudioVideoConfig model)
- pe_audio_video_encoder — PeAudioVideoEncoder (PeAudioVideoEncoderConfig model)
- pe_video — PeVideoModel (PeVideoConfig model)
- pe_video_encoder — PeVideoEncoder (PeVideoEncoderConfig model)
- pegasus — PegasusModel (PegasusConfig model)
- pegasus_x — PegasusXModel (PegasusXConfig model)
- perceiver — PerceiverModel (PerceiverConfig model)
- perception_lm — PerceptionLMModel (PerceptionLMConfig model)
- persimmon — PersimmonModel (PersimmonConfig model)
- phi — PhiModel (PhiConfig model)
- phi3 — Phi3Model (Phi3Config model)
- phi4_multimodal — Phi4MultimodalModel (Phi4MultimodalConfig model)
- phimoe — PhimoeModel (PhimoeConfig model)
- pi0 — PI0Model (PI0Config model)
- pixio — PixioModel (PixioConfig model)
- pixtral — PixtralVisionModel (PixtralVisionConfig model)
- plbart — PLBartModel (PLBartConfig model)
- poolformer — PoolFormerModel (PoolFormerConfig model)
- pp_doclayout_v3 — PPDocLayoutV3Model (PPDocLayoutV3Config model)
- pp_ocrv5_mobile_rec — PPOCRV5MobileRecModel (PPOCRV5MobileRecConfig model)
- pp_ocrv5_server_rec — PPOCRV5ServerRecModel (PPOCRV5ServerRecConfig model)
- prophetnet — ProphetNetModel (ProphetNetConfig model)
- pvt — PvtModel (PvtConfig model)
- pvt_v2 — PvtV2Model (PvtV2Config model)
- qwen2 — Qwen2Model (Qwen2Config model)
- qwen2_5_vl — Qwen2_5_VLModel (Qwen2_5_VLConfig model)
- qwen2_5_vl_text — Qwen2_5_VLTextModel (Qwen2_5_VLTextConfig model)
- qwen2_audio_encoder — Qwen2AudioEncoder (Qwen2AudioEncoderConfig model)
- qwen2_moe — Qwen2MoeModel (Qwen2MoeConfig model)
- qwen2_vl — Qwen2VLModel (Qwen2VLConfig model)
- qwen2_vl_text — Qwen2VLTextModel (Qwen2VLTextConfig model)
- qwen3 — Qwen3Model (Qwen3Config model)
- qwen3_5 — Qwen3_5Model (Qwen3_5Config model)
- qwen3_5_moe — Qwen3_5MoeModel (Qwen3_5MoeConfig model)
- qwen3_5_moe_text — Qwen3_5MoeTextModel (Qwen3_5MoeTextConfig model)
- qwen3_5_text — Qwen3_5TextModel (Qwen3_5TextConfig model)
- qwen3_moe — Qwen3MoeModel (Qwen3MoeConfig model)
- qwen3_next — Qwen3NextModel (Qwen3NextConfig model)
- qwen3_vl — Qwen3VLModel (Qwen3VLConfig model)
- qwen3_vl_moe — Qwen3VLMoeModel (Qwen3VLMoeConfig model)
- qwen3_vl_moe_text — Qwen3VLMoeTextModel (Qwen3VLMoeTextConfig model)
- qwen3_vl_text — Qwen3VLTextModel (Qwen3VLTextConfig model)
- recurrent_gemma — RecurrentGemmaModel (RecurrentGemmaConfig model)
- reformer — ReformerModel (ReformerConfig model)
- regnet — RegNetModel (RegNetConfig model)
- rembert — RemBertModel (RemBertConfig model)
- resnet — ResNetModel (ResNetConfig model)
- roberta — RobertaModel (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormModel (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertModel (RoCBertConfig model)
- roformer — RoFormerModel (RoFormerConfig model)
- rt_detr — RTDetrModel (RTDetrConfig model)
- rt_detr_v2 — RTDetrV2Model (RTDetrV2Config model)
- rwkv — RwkvModel (RwkvConfig model)
- sam — SamModel (SamConfig model)
- sam2 — Sam2Model (Sam2Config model)
- sam2_hiera_det_model — Sam2HieraDetModel (Sam2HieraDetConfig model)
- sam2_video — Sam2VideoModel (Sam2VideoConfig model)
- sam2_vision_model — Sam2VisionModel (Sam2VisionConfig model)
- sam3 — Sam3Model (Sam3Config model)
- sam3_lite_text — Sam3LiteTextModel (Sam3LiteTextConfig model)
- sam3_lite_text_text_model — Sam3LiteTextTextModel (Sam3LiteTextTextConfig model)
- sam3_tracker — Sam3TrackerModel (Sam3TrackerConfig model)
- sam3_tracker_video — Sam3TrackerVideoModel (Sam3TrackerVideoConfig model)
- sam3_video — Sam3VideoModel (Sam3VideoConfig model)
- sam3_vision_model — Sam3VisionModel (Sam3VisionConfig model)
- sam3_vit_model — Sam3ViTModel (Sam3ViTConfig model)
- sam_hq — SamHQModel (SamHQConfig model)
- sam_hq_vision_model — SamHQVisionModel (SamHQVisionConfig model)
- sam_vision_model — SamVisionModel (SamVisionConfig model)
- seamless_m4t — SeamlessM4TModel (SeamlessM4TConfig model)
- seamless_m4t_v2 — SeamlessM4Tv2Model (SeamlessM4Tv2Config model)
- seed_oss — SeedOssModel (SeedOssConfig model)
- segformer — SegformerModel (SegformerConfig model)
- seggpt — SegGptModel (SegGptConfig model)
- sew — SEWModel (SEWConfig model)
- sew-d — SEWDModel (SEWDConfig model)
- siglip — SiglipModel (SiglipConfig model)
- siglip2 — Siglip2Model (Siglip2Config model)
- siglip2_vision_model — Siglip2VisionModel (Siglip2VisionConfig model)
- siglip_vision_model — SiglipVisionModel (SiglipVisionConfig model)
- smollm3 — SmolLM3Model (SmolLM3Config model)
- smolvlm — SmolVLMModel (SmolVLMConfig model)
- smolvlm_vision — SmolVLMVisionTransformer (SmolVLMVisionConfig model)
- solar_open — SolarOpenModel (SolarOpenConfig model)
- speech_to_text — Speech2TextModel (Speech2TextConfig model)
- speecht5 — SpeechT5Model (SpeechT5Config model)
- splinter — SplinterModel (SplinterConfig model)
- squeezebert — SqueezeBertModel (SqueezeBertConfig model)
- stablelm — StableLmModel (StableLmConfig model)
- starcoder2 — Starcoder2Model (Starcoder2Config model)
- swiftformer — SwiftFormerModel (SwiftFormerConfig model)
- swin — SwinModel (SwinConfig model)
- swin2sr — Swin2SRModel (Swin2SRConfig model)
- swinv2 — Swinv2Model (Swinv2Config model)
- switch_transformers — SwitchTransformersModel (SwitchTransformersConfig model)
- t5 — T5Model (T5Config model)
- t5gemma — T5GemmaModel (T5GemmaConfig model)
- t5gemma2 — T5Gemma2Model (T5Gemma2Config model)
- t5gemma2_encoder — T5Gemma2Encoder (T5Gemma2EncoderConfig model)
- table-transformer — TableTransformerModel (TableTransformerConfig model)
- tapas — TapasModel (TapasConfig model)
- textnet — TextNetModel (TextNetConfig model)
- time_series_transformer — TimeSeriesTransformerModel (TimeSeriesTransformerConfig model)
- timesfm — TimesFmModel (TimesFmConfig model)
- timesfm2_5 — TimesFm2_5Model (TimesFm2_5Config model)
- timesformer — TimesformerModel (TimesformerConfig model)
- timm_backbone — TimmBackbone (TimmBackboneConfig model)
- timm_wrapper — TimmWrapperModel (TimmWrapperConfig model)
- tvp — TvpModel (TvpConfig model)
- udop — UdopModel (UdopConfig model)
- umt5 — UMT5Model (UMT5Config model)
- unispeech — UniSpeechModel (UniSpeechConfig model)
- unispeech-sat — UniSpeechSatModel (UniSpeechSatConfig model)
- univnet — UnivNetModel (UnivNetConfig model)
- uvdoc — UVDocModel (UVDocConfig model)
- vaultgemma — VaultGemmaModel (VaultGemmaConfig model)
- vibevoice_acoustic_tokenizer — VibeVoiceAcousticTokenizerModel (VibeVoiceAcousticTokenizerConfig model)
- vibevoice_acoustic_tokenizer_decoder — VibeVoiceAcousticTokenizerDecoderModel (VibeVoiceAcousticTokenizerDecoderConfig model)
- vibevoice_acoustic_tokenizer_encoder — VibeVoiceAcousticTokenizerEncoderModel (VibeVoiceAcousticTokenizerEncoderConfig model)
- vibevoice_asr — VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- video_llama_3 — VideoLlama3Model (VideoLlama3Config model)
- video_llama_3_vision — VideoLlama3VisionModel (VideoLlama3VisionConfig model)
- video_llava — VideoLlavaModel (VideoLlavaConfig model)
- videomae — VideoMAEModel (VideoMAEConfig model)
- vilt — ViltModel (ViltConfig model)
- vipllava — VipLlavaModel (VipLlavaConfig model)
- vision-text-dual-encoder — VisionTextDualEncoderModel (VisionTextDualEncoderConfig model)
- visual_bert — VisualBertModel (VisualBertConfig model)
- vit — ViTModel (ViTConfig model)
- vit_mae — ViTMAEModel (ViTMAEConfig model)
- vit_msn — ViTMSNModel (ViTMSNConfig model)
- vitdet — VitDetModel (VitDetConfig model)
- vits — VitsModel (VitsConfig model)
- vivit — VivitModel (VivitConfig model)
- vjepa2 — VJEPA2Model (VJEPA2Config model)
- voxtral — VoxtralForConditionalGeneration (VoxtralConfig model)
- voxtral_encoder — VoxtralEncoder (VoxtralEncoderConfig model)
- voxtral_realtime — VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
- voxtral_realtime_encoder — VoxtralRealtimeEncoder (VoxtralRealtimeEncoderConfig model)
- voxtral_realtime_text — VoxtralRealtimeTextModel (VoxtralRealtimeTextConfig model)
- wav2vec2 — Wav2Vec2Model (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2BertModel (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2ConformerModel (Wav2Vec2ConformerConfig model)
- wavlm — WavLMModel (WavLMConfig model)
- whisper — WhisperModel (WhisperConfig model)
- xclip — XCLIPModel (XCLIPConfig model)
- xcodec — XcodecModel (XcodecConfig model)
- xglm — XGLMModel (XGLMConfig model)
- xlm — XLMModel (XLMConfig model)
- xlm-roberta — XLMRobertaModel (XLMRobertaConfig model)
- xlm-roberta-xl — XLMRobertaXLModel (XLMRobertaXLConfig model)
- xlnet — XLNetModel (XLNetConfig model)
- xlstm — xLSTMModel (xLSTMConfig model)
- xmod — XmodModel (XmodConfig model)
- yolos — YolosModel (YolosConfig model)
- yoso — YosoModel (YosoConfig model)
- youtu — YoutuModel (YoutuConfig model)
- zamba — ZambaModel (ZambaConfig model)
- zamba2 — Zamba2Model (Zamba2Config model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
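As a minimal sketch of this mode toggle (using a tiny, randomly-initialized model built from a config so nothing is downloaded; the small BertConfig sizes are illustrative assumptions, not recommended values):

```python
from transformers import AutoModel, BertConfig

# Tiny, randomly-initialized BERT (illustrative sizes; no weights downloaded).
config = BertConfig(
    hidden_size=32, num_hidden_layers=1, num_attention_heads=2,
    intermediate_size=64, vocab_size=128,
)
model = AutoModel.from_config(config)

model.eval()   # evaluation mode: dropout modules are deactivated
assert not model.training

model.train()  # back to training mode before fine-tuning
assert model.training
```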
Examples:
>>> from transformers import AutoConfig, AutoModel
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
Generic pretraining classes
The following auto classes are available for instantiating a model with a pretraining head.
AutoModelForPreTraining
This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForPreTraining (AlbertConfig model)
- AudioFlamingo3Config configuration class: AudioFlamingo3ForConditionalGeneration (AudioFlamingo3Config model)
- BartConfig configuration class: BartForConditionalGeneration (BartConfig model)
- BertConfig configuration class: BertForPreTraining (BertConfig model)
- BigBirdConfig configuration class: BigBirdForPreTraining (BigBirdConfig model)
- BloomConfig configuration class: BloomForCausalLM (BloomConfig model)
- CTRLConfig configuration class: CTRLLMHeadModel (CTRLConfig model)
- CamembertConfig configuration class: CamembertForMaskedLM (CamembertConfig model)
- ColModernVBertConfig configuration class: ColModernVBertForRetrieval (ColModernVBertConfig model)
- ColPaliConfig configuration class: ColPaliForRetrieval (ColPaliConfig model)
- ColQwen2Config configuration class: ColQwen2ForRetrieval (ColQwen2Config model)
- Data2VecTextConfig configuration class: Data2VecTextForMaskedLM (Data2VecTextConfig model)
- DebertaConfig configuration class: DebertaForMaskedLM (DebertaConfig model)
- DebertaV2Config configuration class: DebertaV2ForMaskedLM (DebertaV2Config model)
- DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBertConfig model)
- ElectraConfig configuration class: ElectraForPreTraining (ElectraConfig model)
- ErnieConfig configuration class: ErnieForPreTraining (ErnieConfig model)
- EvollaConfig configuration class: EvollaForProteinText2Text (EvollaConfig model)
- Exaone4Config configuration class: Exaone4ForCausalLM (Exaone4Config model)
- ExaoneMoeConfig configuration class: ExaoneMoeForCausalLM (ExaoneMoeConfig model)
- FNetConfig configuration class: FNetForPreTraining (FNetConfig model)
- FSMTConfig configuration class: FSMTForConditionalGeneration (FSMTConfig model)
- FalconMambaConfig configuration class: FalconMambaForCausalLM (FalconMambaConfig model)
- FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlaubertConfig model)
- FlavaConfig configuration class: FlavaForPreTraining (FlavaConfig model)
- Florence2Config configuration class: Florence2ForConditionalGeneration (Florence2Config model)
- FunnelConfig configuration class: FunnelForPreTraining (FunnelConfig model)
- GPT2Config configuration class: GPT2LMHeadModel (GPT2Config model)
- GPTBigCodeConfig configuration class: GPTBigCodeForCausalLM (GPTBigCodeConfig model)
- Gemma3Config configuration class: Gemma3ForConditionalGeneration (Gemma3Config model)
- Gemma4Config configuration class: Gemma4ForConditionalGeneration (Gemma4Config model)
- GlmAsrConfig configuration class: GlmAsrForConditionalGeneration (GlmAsrConfig model)
- HieraConfig configuration class: HieraForPreTraining (HieraConfig model)
- IBertConfig configuration class: IBertForMaskedLM (IBertConfig model)
- Idefics2Config configuration class: Idefics2ForConditionalGeneration (Idefics2Config model)
- Idefics3Config configuration class: Idefics3ForConditionalGeneration (Idefics3Config model)
- IdeficsConfig configuration class: IdeficsForVisionText2Text (IdeficsConfig model)
- JanusConfig configuration class: JanusForConditionalGeneration (JanusConfig model)
- LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLMConfig model)
- LlavaConfig configuration class: LlavaForConditionalGeneration (LlavaConfig model)
- LlavaNextConfig configuration class: LlavaNextForConditionalGeneration (LlavaNextConfig model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoForConditionalGeneration (LlavaNextVideoConfig model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionForConditionalGeneration (LlavaOnevisionConfig model)
- LongformerConfig configuration class: LongformerForMaskedLM (LongformerConfig model)
- LukeConfig configuration class: LukeForMaskedLM (LukeConfig model)
- LxmertConfig configuration class: LxmertForPreTraining (LxmertConfig model)
- MPNetConfig configuration class: MPNetForMaskedLM (MPNetConfig model)
- Mamba2Config configuration class: Mamba2ForCausalLM (Mamba2Config model)
- MambaConfig configuration class: MambaForCausalLM (MambaConfig model)
- MegatronBertConfig configuration class: MegatronBertForPreTraining (MegatronBertConfig model)
- Mistral3Config configuration class: Mistral3ForConditionalGeneration (Mistral3Config model)
- Mistral4Config configuration class: Mistral4ForCausalLM (Mistral4Config model)
- MllamaConfig configuration class: MllamaForConditionalGeneration (MllamaConfig model)
- MobileBertConfig configuration class: MobileBertForPreTraining (MobileBertConfig model)
- MptConfig configuration class: MptForCausalLM (MptConfig model)
- MraConfig configuration class: MraForMaskedLM (MraConfig model)
- MusicFlamingoConfig configuration class: MusicFlamingoForConditionalGeneration (MusicFlamingoConfig model)
- MvpConfig configuration class: MvpForConditionalGeneration (MvpConfig model)
- NanoChatConfig configuration class: NanoChatForCausalLM (NanoChatConfig model)
- NllbMoeConfig configuration class: NllbMoeForConditionalGeneration (NllbMoeConfig model)
- OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAIGPTConfig model)
- PaliGemmaConfig configuration class: PaliGemmaForConditionalGeneration (PaliGemmaConfig model)
- Qwen2AudioConfig configuration class: Qwen2AudioForConditionalGeneration (Qwen2AudioConfig model)
- RoCBertConfig configuration class: RoCBertForPreTraining (RoCBertConfig model)
- RobertaConfig configuration class: RobertaForMaskedLM (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForMaskedLM (RobertaPreLayerNormConfig model)
- RwkvConfig configuration class: RwkvForCausalLM (RwkvConfig model)
- SplinterConfig configuration class: SplinterForPreTraining (SplinterConfig model)
- SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBertConfig model)
- SwitchTransformersConfig configuration class: SwitchTransformersForConditionalGeneration (SwitchTransformersConfig model)
- T5Config configuration class: T5ForConditionalGeneration (T5Config model)
- T5Gemma2Config configuration class: T5Gemma2ForConditionalGeneration (T5Gemma2Config model)
- T5GemmaConfig configuration class: T5GemmaForConditionalGeneration (T5GemmaConfig model)
- TapasConfig configuration class: TapasForMaskedLM (TapasConfig model)
- UniSpeechConfig configuration class: UniSpeechForPreTraining (UniSpeechConfig model)
- UniSpeechSatConfig configuration class: UniSpeechSatForPreTraining (UniSpeechSatConfig model)
- ViTMAEConfig configuration class: ViTMAEForPreTraining (ViTMAEConfig model)
- VibeVoiceAsrConfig configuration class: VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- VideoLlavaConfig configuration class: VideoLlavaForConditionalGeneration (VideoLlavaConfig model)
- VideoMAEConfig configuration class: VideoMAEForPreTraining (VideoMAEConfig model)
- VipLlavaConfig configuration class: VipLlavaForConditionalGeneration (VipLlavaConfig model)
- VisualBertConfig configuration class: VisualBertForPreTraining (VisualBertConfig model)
- VoxtralConfig configuration class: VoxtralForConditionalGeneration (VoxtralConfig model)
- VoxtralRealtimeConfig configuration class: VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
- Wav2Vec2Config configuration class: Wav2Vec2ForPreTraining (Wav2Vec2Config model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForPreTraining (Wav2Vec2ConformerConfig model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMaskedLM (XLMRobertaXLConfig model)
- XLNetConfig configuration class: XLNetLMHeadModel (XLNetConfig model)
- XmodConfig configuration class: XmodForMaskedLM (XmodConfig model)
- xLSTMConfig configuration class: xLSTMForCausalLM (xLSTMConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with a pretraining head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForPreTraining (AlbertConfig model)
- audioflamingo3 — AudioFlamingo3ForConditionalGeneration (AudioFlamingo3Config model)
- bart — BartForConditionalGeneration (BartConfig model)
- bert — BertForPreTraining (BertConfig model)
- big_bird — BigBirdForPreTraining (BigBirdConfig model)
- bloom — BloomForCausalLM (BloomConfig model)
- camembert — CamembertForMaskedLM (CamembertConfig model)
- colmodernvbert — ColModernVBertForRetrieval (ColModernVBertConfig model)
- colpali — ColPaliForRetrieval (ColPaliConfig model)
- colqwen2 — ColQwen2ForRetrieval (ColQwen2Config model)
- ctrl — CTRLLMHeadModel (CTRLConfig model)
- data2vec-text — Data2VecTextForMaskedLM (Data2VecTextConfig model)
- deberta — DebertaForMaskedLM (DebertaConfig model)
- deberta-v2 — DebertaV2ForMaskedLM (DebertaV2Config model)
- distilbert — DistilBertForMaskedLM (DistilBertConfig model)
- electra — ElectraForPreTraining (ElectraConfig model)
- ernie — ErnieForPreTraining (ErnieConfig model)
- evolla — EvollaForProteinText2Text (EvollaConfig model)
- exaone4 — Exaone4ForCausalLM (Exaone4Config model)
- exaone_moe — ExaoneMoeForCausalLM (ExaoneMoeConfig model)
- falcon_mamba — FalconMambaForCausalLM (FalconMambaConfig model)
- flaubert — FlaubertWithLMHeadModel (FlaubertConfig model)
- flava — FlavaForPreTraining (FlavaConfig model)
- florence2 — Florence2ForConditionalGeneration (Florence2Config model)
- fnet — FNetForPreTraining (FNetConfig model)
- fsmt — FSMTForConditionalGeneration (FSMTConfig model)
- funnel — FunnelForPreTraining (FunnelConfig model)
- gemma3 — Gemma3ForConditionalGeneration (Gemma3Config model)
- gemma4 — Gemma4ForConditionalGeneration (Gemma4Config model)
- glmasr — GlmAsrForConditionalGeneration (GlmAsrConfig model)
- gpt-sw3 — GPT2LMHeadModel (GPT2Config model)
- gpt2 — GPT2LMHeadModel (GPT2Config model)
- gpt_bigcode — GPTBigCodeForCausalLM (GPTBigCodeConfig model)
- hiera — HieraForPreTraining (HieraConfig model)
- ibert — IBertForMaskedLM (IBertConfig model)
- idefics — IdeficsForVisionText2Text (IdeficsConfig model)
- idefics2 — Idefics2ForConditionalGeneration (Idefics2Config model)
- idefics3 — Idefics3ForConditionalGeneration (Idefics3Config model)
- janus — JanusForConditionalGeneration (JanusConfig model)
- layoutlm — LayoutLMForMaskedLM (LayoutLMConfig model)
- llava — LlavaForConditionalGeneration (LlavaConfig model)
- llava_next — LlavaNextForConditionalGeneration (LlavaNextConfig model)
- llava_next_video — LlavaNextVideoForConditionalGeneration (LlavaNextVideoConfig model)
- llava_onevision — LlavaOnevisionForConditionalGeneration (LlavaOnevisionConfig model)
- longformer — LongformerForMaskedLM (LongformerConfig model)
- luke — LukeForMaskedLM (LukeConfig model)
- lxmert — LxmertForPreTraining (LxmertConfig model)
- mamba — MambaForCausalLM (MambaConfig model)
- mamba2 — Mamba2ForCausalLM (Mamba2Config model)
- megatron-bert — MegatronBertForPreTraining (MegatronBertConfig model)
- mistral3 — Mistral3ForConditionalGeneration (Mistral3Config model)
- mistral4 — Mistral4ForCausalLM (Mistral4Config model)
- mllama — MllamaForConditionalGeneration (MllamaConfig model)
- mobilebert — MobileBertForPreTraining (MobileBertConfig model)
- mpnet — MPNetForMaskedLM (MPNetConfig model)
- mpt — MptForCausalLM (MptConfig model)
- mra — MraForMaskedLM (MraConfig model)
- musicflamingo — MusicFlamingoForConditionalGeneration (MusicFlamingoConfig model)
- mvp — MvpForConditionalGeneration (MvpConfig model)
- nanochat — NanoChatForCausalLM (NanoChatConfig model)
- nllb-moe — NllbMoeForConditionalGeneration (NllbMoeConfig model)
- openai-gpt — OpenAIGPTLMHeadModel (OpenAIGPTConfig model)
- paligemma — PaliGemmaForConditionalGeneration (PaliGemmaConfig model)
- qwen2_audio — Qwen2AudioForConditionalGeneration (Qwen2AudioConfig model)
- roberta — RobertaForMaskedLM (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormForMaskedLM (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertForPreTraining (RoCBertConfig model)
- rwkv — RwkvForCausalLM (RwkvConfig model)
- splinter — SplinterForPreTraining (SplinterConfig model)
- squeezebert — SqueezeBertForMaskedLM (SqueezeBertConfig model)
- switch_transformers — SwitchTransformersForConditionalGeneration (SwitchTransformersConfig model)
- t5 — T5ForConditionalGeneration (T5Config model)
- t5gemma — T5GemmaForConditionalGeneration (T5GemmaConfig model)
- t5gemma2 — T5Gemma2ForConditionalGeneration (T5Gemma2Config model)
- tapas — TapasForMaskedLM (TapasConfig model)
- unispeech — UniSpeechForPreTraining (UniSpeechConfig model)
- unispeech-sat — UniSpeechSatForPreTraining (UniSpeechSatConfig model)
- vibevoice_asr — VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- video_llava — VideoLlavaForConditionalGeneration (VideoLlavaConfig model)
- videomae — VideoMAEForPreTraining (VideoMAEConfig model)
- vipllava — VipLlavaForConditionalGeneration (VipLlavaConfig model)
- visual_bert — VisualBertForPreTraining (VisualBertConfig model)
- vit_mae — ViTMAEForPreTraining (ViTMAEConfig model)
- voxtral — VoxtralForConditionalGeneration (VoxtralConfig model)
- voxtral_realtime — VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
- wav2vec2 — Wav2Vec2ForPreTraining (Wav2Vec2Config model)
- wav2vec2-conformer — Wav2Vec2ConformerForPreTraining (Wav2Vec2ConformerConfig model)
- xlm — XLMWithLMHeadModel (XLMConfig model)
- xlm-roberta — XLMRobertaForMaskedLM (XLMRobertaConfig model)
- xlm-roberta-xl — XLMRobertaXLForMaskedLM (XLMRobertaXLConfig model)
- xlnet — XLNetLMHeadModel (XLNetConfig model)
- xlstm — xLSTMForCausalLM (xLSTMConfig model)
- xmod — XmodForMaskedLM (XmodConfig model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForPreTraining
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
Natural Language Processing
The following auto classes are available for the following natural language processing tasks.
AutoModelForCausalLM
This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AfmoeConfig configuration class: AfmoeForCausalLM (AfmoeConfig model)
- ApertusConfig configuration class: ApertusForCausalLM (ApertusConfig model)
- ArceeConfig configuration class: ArceeForCausalLM (ArceeConfig model)
- AriaTextConfig configuration class: AriaTextForCausalLM (AriaTextConfig model)
- BambaConfig configuration class: BambaForCausalLM (BambaConfig model)
- BartConfig configuration class: BartForCausalLM (BartConfig model)
- BertConfig configuration class: BertLMHeadModel (BertConfig model)
- BertGenerationConfig configuration class: BertGenerationDecoder (BertGenerationConfig model)
- BigBirdConfig configuration class: BigBirdForCausalLM (BigBirdConfig model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForCausalLM (BigBirdPegasusConfig model)
- BioGptConfig configuration class: BioGptForCausalLM (BioGptConfig model)
- BitNetConfig configuration class: BitNetForCausalLM (BitNetConfig model)
- BlenderbotConfig configuration class: BlenderbotForCausalLM (BlenderbotConfig model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallForCausalLM (BlenderbotSmallConfig model)
- BloomConfig configuration class: BloomForCausalLM (BloomConfig model)
- BltConfig configuration class: BltForCausalLM (BltConfig model)
- CTRLConfig configuration class: CTRLLMHeadModel (CTRLConfig model)
- CamembertConfig configuration class: CamembertForCausalLM (CamembertConfig model)
- CodeGenConfig configuration class: CodeGenForCausalLM (CodeGenConfig model)
- Cohere2Config configuration class: Cohere2ForCausalLM (Cohere2Config model)
- CohereConfig configuration class: CohereForCausalLM (CohereConfig model)
- CpmAntConfig configuration class: CpmAntForCausalLM (CpmAntConfig model)
- CwmConfig configuration class: CwmForCausalLM (CwmConfig model)
- Data2VecTextConfig configuration class: Data2VecTextForCausalLM (Data2VecTextConfig model)
- DbrxConfig configuration class: DbrxForCausalLM (DbrxConfig model)
- DeepseekV2Config configuration class: DeepseekV2ForCausalLM (DeepseekV2Config model)
- DeepseekV3Config configuration class: DeepseekV3ForCausalLM (DeepseekV3Config model)
- DiffLlamaConfig configuration class: DiffLlamaForCausalLM (DiffLlamaConfig model)
- DogeConfig configuration class: DogeForCausalLM (DogeConfig model)
- Dots1Config configuration class: Dots1ForCausalLM (Dots1Config model)
- ElectraConfig configuration class: ElectraForCausalLM (ElectraConfig model)
- Emu3Config configuration class: Emu3ForCausalLM (Emu3Config model)
- Ernie4_5Config configuration class: Ernie4_5ForCausalLM (Ernie4_5Config model)
- Ernie4_5_MoeConfig configuration class: Ernie4_5_MoeForCausalLM (Ernie4_5_MoeConfig model)
- ErnieConfig configuration class: ErnieForCausalLM (ErnieConfig model)
- Exaone4Config configuration class: Exaone4ForCausalLM (Exaone4Config model)
- ExaoneMoeConfig configuration class: ExaoneMoeForCausalLM (ExaoneMoeConfig model)
- FalconConfig configuration class: FalconForCausalLM (FalconConfig model)
- FalconH1Config configuration class: FalconH1ForCausalLM (FalconH1Config model)
- FalconMambaConfig configuration class: FalconMambaForCausalLM (FalconMambaConfig model)
- FlexOlmoConfig configuration class: FlexOlmoForCausalLM (FlexOlmoConfig model)
- FuyuConfig configuration class: FuyuForCausalLM (FuyuConfig model)
- GPT2Config configuration class: GPT2LMHeadModel (GPT2Config model)
- GPTBigCodeConfig configuration class: GPTBigCodeForCausalLM (GPTBigCodeConfig model)
- GPTJConfig configuration class: GPTJForCausalLM (GPTJConfig model)
- GPTNeoConfig configuration class: GPTNeoForCausalLM (GPTNeoConfig model)
- GPTNeoXConfig configuration class: GPTNeoXForCausalLM (GPTNeoXConfig model)
- GPTNeoXJapaneseConfig configuration class: GPTNeoXJapaneseForCausalLM (GPTNeoXJapaneseConfig model)
- Gemma2Config configuration class: Gemma2ForCausalLM (Gemma2Config model)
- Gemma3Config configuration class: Gemma3ForConditionalGeneration (Gemma3Config model)
- Gemma3TextConfig configuration class: Gemma3ForCausalLM (Gemma3TextConfig model)
- Gemma3nConfig configuration class: Gemma3nForConditionalGeneration (Gemma3nConfig model)
- Gemma3nTextConfig configuration class: Gemma3nForCausalLM (Gemma3nTextConfig model)
- Gemma4Config configuration class: Gemma4ForConditionalGeneration (Gemma4Config model)
- Gemma4TextConfig configuration class: Gemma4ForCausalLM (Gemma4TextConfig model)
- GemmaConfig configuration class: GemmaForCausalLM (GemmaConfig model)
- GitConfig configuration class: GitForCausalLM (GitConfig model)
- Glm4Config configuration class: Glm4ForCausalLM (Glm4Config model)
- Glm4MoeConfig configuration class: Glm4MoeForCausalLM (Glm4MoeConfig model)
- Glm4MoeLiteConfig configuration class: Glm4MoeLiteForCausalLM (Glm4MoeLiteConfig model)
- GlmConfig configuration class: GlmForCausalLM (GlmConfig model)
- GlmMoeDsaConfig configuration class: GlmMoeDsaForCausalLM (GlmMoeDsaConfig model)
- GotOcr2Config configuration class: GotOcr2ForConditionalGeneration (GotOcr2Config model)
- GptOssConfig configuration class: GptOssForCausalLM (GptOssConfig model)
- GraniteConfig configuration class: GraniteForCausalLM (GraniteConfig model)
- GraniteMoeConfig configuration class: GraniteMoeForCausalLM (GraniteMoeConfig model)
- GraniteMoeHybridConfig configuration class: GraniteMoeHybridForCausalLM (GraniteMoeHybridConfig model)
- GraniteMoeSharedConfig configuration class: GraniteMoeSharedForCausalLM (GraniteMoeSharedConfig model)
- HeliumConfig configuration class: HeliumForCausalLM (HeliumConfig model)
- HunYuanDenseV1Config configuration class: HunYuanDenseV1ForCausalLM (HunYuanDenseV1Config model)
- HunYuanMoEV1Config configuration class: HunYuanMoEV1ForCausalLM (HunYuanMoEV1Config model)
- Jais2Config configuration class: Jais2ForCausalLM (Jais2Config model)
- JambaConfig configuration class: JambaForCausalLM (JambaConfig model)
- JetMoeConfig configuration class: JetMoeForCausalLM (JetMoeConfig model)
- Lfm2Config configuration class: Lfm2ForCausalLM (Lfm2Config model)
- Lfm2MoeConfig configuration class: Lfm2MoeForCausalLM (Lfm2MoeConfig model)
- Llama4Config configuration class: Llama4ForCausalLM (Llama4Config model)
- Llama4TextConfig configuration class: Llama4ForCausalLM (Llama4TextConfig model)
- LlamaConfig configuration class: LlamaForCausalLM (LlamaConfig model)
- LongcatFlashConfig configuration class: LongcatFlashForCausalLM (LongcatFlashConfig model)
- MBartConfig configuration class: MBartForCausalLM (MBartConfig model)
- Mamba2Config configuration class: Mamba2ForCausalLM (Mamba2Config model)
- MambaConfig configuration class: MambaForCausalLM (MambaConfig model)
- MarianConfig configuration class: MarianForCausalLM (MarianConfig model)
- MegatronBertConfig configuration class: MegatronBertForCausalLM (MegatronBertConfig model)
- MiniMaxConfig configuration class: MiniMaxForCausalLM (MiniMaxConfig model)
- MiniMaxM2Config configuration class: MiniMaxM2ForCausalLM (MiniMaxM2Config model)
- Ministral3Config configuration class: Ministral3ForCausalLM (Ministral3Config model)
- MinistralConfig configuration class: MinistralForCausalLM (MinistralConfig model)
- MistralConfig configuration class: MistralForCausalLM (MistralConfig model)
- MixtralConfig configuration class: MixtralForCausalLM (MixtralConfig model)
- MllamaConfig configuration class: MllamaForCausalLM (MllamaConfig model)
- ModernBertDecoderConfig configuration class: ModernBertDecoderForCausalLM (ModernBertDecoderConfig model)
- MoshiConfig configuration class: MoshiForCausalLM (MoshiConfig model)
- MptConfig configuration class: MptForCausalLM (MptConfig model)
- MusicgenConfig configuration class: MusicgenForCausalLM (MusicgenConfig model)
- MusicgenMelodyConfig configuration class: MusicgenMelodyForCausalLM (MusicgenMelodyConfig model)
- MvpConfig configuration class: MvpForCausalLM (MvpConfig model)
- NanoChatConfig configuration class: NanoChatForCausalLM (NanoChatConfig model)
- NemotronConfig configuration class: NemotronForCausalLM (NemotronConfig model)
- NemotronHConfig configuration class: NemotronHForCausalLM (NemotronHConfig model)
- OPTConfig configuration class: OPTForCausalLM (OPTConfig model)
- Olmo2Config configuration class: Olmo2ForCausalLM (Olmo2Config model)
- Olmo3Config configuration class: Olmo3ForCausalLM (Olmo3Config model)
- OlmoConfig configuration class: OlmoForCausalLM (OlmoConfig model)
- OlmoHybridConfig configuration class: OlmoHybridForCausalLM (OlmoHybridConfig model)
- OlmoeConfig configuration class: OlmoeForCausalLM (OlmoeConfig model)
- OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAIGPTConfig model)
- PLBartConfig configuration class: PLBartForCausalLM (PLBartConfig model)
- PegasusConfig configuration class: PegasusForCausalLM (PegasusConfig model)
- PersimmonConfig configuration class: PersimmonForCausalLM (PersimmonConfig model)
- Phi3Config configuration class: Phi3ForCausalLM (Phi3Config model)
- Phi4MultimodalConfig configuration class: Phi4MultimodalForCausalLM (Phi4MultimodalConfig model)
- PhiConfig configuration class: PhiForCausalLM (PhiConfig model)
- PhimoeConfig configuration class: PhimoeForCausalLM (PhimoeConfig model)
- ProphetNetConfig configuration class: ProphetNetForCausalLM (ProphetNetConfig model)
- Qwen2Config configuration class: Qwen2ForCausalLM (Qwen2Config model)
- Qwen2MoeConfig configuration class: Qwen2MoeForCausalLM (Qwen2MoeConfig model)
- Qwen3Config configuration class: Qwen3ForCausalLM (Qwen3Config model)
- Qwen3MoeConfig configuration class: Qwen3MoeForCausalLM (Qwen3MoeConfig model)
- Qwen3NextConfig configuration class: Qwen3NextForCausalLM (Qwen3NextConfig model)
- Qwen3_5Config configuration class: Qwen3_5ForCausalLM (Qwen3_5Config model)
- Qwen3_5MoeConfig configuration class: Qwen3_5MoeForCausalLM (Qwen3_5MoeConfig model)
- Qwen3_5MoeTextConfig configuration class: Qwen3_5MoeForCausalLM (Qwen3_5MoeTextConfig model)
- Qwen3_5TextConfig configuration class: Qwen3_5ForCausalLM (Qwen3_5TextConfig model)
- RecurrentGemmaConfig configuration class: RecurrentGemmaForCausalLM (RecurrentGemmaConfig model)
- ReformerConfig configuration class: ReformerModelWithLMHead (ReformerConfig model)
- RemBertConfig configuration class: RemBertForCausalLM (RemBertConfig model)
- RoCBertConfig configuration class: RoCBertForCausalLM (RoCBertConfig model)
- RoFormerConfig configuration class: RoFormerForCausalLM (RoFormerConfig model)
- RobertaConfig configuration class: RobertaForCausalLM (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForCausalLM (RobertaPreLayerNormConfig model)
- RwkvConfig configuration class: RwkvForCausalLM (RwkvConfig model)
- SeedOssConfig configuration class: SeedOssForCausalLM (SeedOssConfig model)
- SmolLM3Config configuration class: SmolLM3ForCausalLM (SmolLM3Config model)
- SolarOpenConfig configuration class: SolarOpenForCausalLM (SolarOpenConfig model)
- StableLmConfig configuration class: StableLmForCausalLM (StableLmConfig model)
- Starcoder2Config configuration class: Starcoder2ForCausalLM (Starcoder2Config model)
- TrOCRConfig configuration class: TrOCRForCausalLM (TrOCRConfig model)
- VaultGemmaConfig configuration class: VaultGemmaForCausalLM (VaultGemmaConfig model)
- WhisperConfig configuration class: WhisperForCausalLM (WhisperConfig model)
- XGLMConfig configuration class: XGLMForCausalLM (XGLMConfig model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaForCausalLM (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForCausalLM (XLMRobertaXLConfig model)
- XLNetConfig configuration class: XLNetLMHeadModel (XLNetConfig model)
- XmodConfig configuration class: XmodForCausalLM (XmodConfig model)
- YoutuConfig configuration class: YoutuForCausalLM (YoutuConfig model)
- Zamba2Config configuration class: Zamba2ForCausalLM (Zamba2Config model)
- ZambaConfig configuration class: ZambaForCausalLM (ZambaConfig model)
- xLSTMConfig configuration class: xLSTMForCausalLM (xLSTMConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
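As a quick sketch of the distinction above, from_config() builds a randomly initialized model of the matching architecture without downloading any weights. The tiny layer and head sizes below are arbitrary illustration values, not recommended settings:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Build a deliberately tiny GPT-2 config in memory; the sizes are
# arbitrary illustration values chosen only to keep the model small.
config = AutoConfig.for_model(
    "gpt2", n_layer=2, n_head=2, n_embd=64, vocab_size=100
)

# from_config() selects the architecture from the config class and returns
# a randomly initialized model -- no weights are downloaded or loaded.
model = AutoModelForCausalLM.from_config(config)
print(type(model).__name__)  # GPT2LMHeadModel
```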
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) —
  Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of the state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, code_revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) —
  Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
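The local-directory loading path described in the parameters above can be exercised end to end without touching the Hub; a minimal sketch (the tiny config values are arbitrary illustration sizes):

```python
import tempfile

from transformers import AutoConfig, AutoModelForCausalLM

# A tiny, randomly initialized model (arbitrary illustration sizes).
config = AutoConfig.for_model(
    "gpt2", n_layer=2, n_head=2, n_embd=64, vocab_size=100
)
model = AutoModelForCausalLM.from_config(config)

with tempfile.TemporaryDirectory() as tmp:
    # save_pretrained() writes config.json plus the weight files ...
    model.save_pretrained(tmp)
    # ... so the directory can be passed back as pretrained_model_name_or_path.
    reloaded = AutoModelForCausalLM.from_pretrained(tmp)
    print(type(reloaded).__name__)  # GPT2LMHeadModel
```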
Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- afmoe — AfmoeForCausalLM (AfmoeConfig model)
- apertus — ApertusForCausalLM (ApertusConfig model)
- arcee — ArceeForCausalLM (ArceeConfig model)
- aria_text — AriaTextForCausalLM (AriaTextConfig model)
- bamba — BambaForCausalLM (BambaConfig model)
- bart — BartForCausalLM (BartConfig model)
- bert — BertLMHeadModel (BertConfig model)
- bert-generation — BertGenerationDecoder (BertGenerationConfig model)
- big_bird — BigBirdForCausalLM (BigBirdConfig model)
- bigbird_pegasus — BigBirdPegasusForCausalLM (BigBirdPegasusConfig model)
- biogpt — BioGptForCausalLM (BioGptConfig model)
- bitnet — BitNetForCausalLM (BitNetConfig model)
- blenderbot — BlenderbotForCausalLM (BlenderbotConfig model)
- blenderbot-small — BlenderbotSmallForCausalLM (BlenderbotSmallConfig model)
- bloom — BloomForCausalLM (BloomConfig model)
- blt — BltForCausalLM (BltConfig model)
- camembert — CamembertForCausalLM (CamembertConfig model)
- codegen — CodeGenForCausalLM (CodeGenConfig model)
- cohere — CohereForCausalLM (CohereConfig model)
- cohere2 — Cohere2ForCausalLM (Cohere2Config model)
- cpmant — CpmAntForCausalLM (CpmAntConfig model)
- ctrl — CTRLLMHeadModel (CTRLConfig model)
- cwm — CwmForCausalLM (CwmConfig model)
- data2vec-text — Data2VecTextForCausalLM (Data2VecTextConfig model)
- dbrx — DbrxForCausalLM (DbrxConfig model)
- deepseek_v2 — DeepseekV2ForCausalLM (DeepseekV2Config model)
- deepseek_v3 — DeepseekV3ForCausalLM (DeepseekV3Config model)
- diffllama — DiffLlamaForCausalLM (DiffLlamaConfig model)
- doge — DogeForCausalLM (DogeConfig model)
- dots1 — Dots1ForCausalLM (Dots1Config model)
- electra — ElectraForCausalLM (ElectraConfig model)
- emu3 — Emu3ForCausalLM (Emu3Config model)
- ernie — ErnieForCausalLM (ErnieConfig model)
- ernie4_5 — Ernie4_5ForCausalLM (Ernie4_5Config model)
- ernie4_5_moe — Ernie4_5_MoeForCausalLM (Ernie4_5_MoeConfig model)
- exaone4 — Exaone4ForCausalLM (Exaone4Config model)
- exaone_moe — ExaoneMoeForCausalLM (ExaoneMoeConfig model)
- falcon — FalconForCausalLM (FalconConfig model)
- falcon_h1 — FalconH1ForCausalLM (FalconH1Config model)
- falcon_mamba — FalconMambaForCausalLM (FalconMambaConfig model)
- flex_olmo — FlexOlmoForCausalLM (FlexOlmoConfig model)
- fuyu — FuyuForCausalLM (FuyuConfig model)
- gemma — GemmaForCausalLM (GemmaConfig model)
- gemma2 — Gemma2ForCausalLM (Gemma2Config model)
- gemma3 — Gemma3ForConditionalGeneration (Gemma3Config model)
- gemma3_text — Gemma3ForCausalLM (Gemma3TextConfig model)
- gemma3n — Gemma3nForConditionalGeneration (Gemma3nConfig model)
- gemma3n_text — Gemma3nForCausalLM (Gemma3nTextConfig model)
- gemma4 — Gemma4ForConditionalGeneration (Gemma4Config model)
- gemma4_text — Gemma4ForCausalLM (Gemma4TextConfig model)
- git — GitForCausalLM (GitConfig model)
- glm — GlmForCausalLM (GlmConfig model)
- glm4 — Glm4ForCausalLM (Glm4Config model)
- glm4_moe — Glm4MoeForCausalLM (Glm4MoeConfig model)
- glm4_moe_lite — Glm4MoeLiteForCausalLM (Glm4MoeLiteConfig model)
- glm_moe_dsa — GlmMoeDsaForCausalLM (GlmMoeDsaConfig model)
- got_ocr2 — GotOcr2ForConditionalGeneration (GotOcr2Config model)
- gpt-sw3 — GPT2LMHeadModel (GPT2Config model)
- gpt2 — GPT2LMHeadModel (GPT2Config model)
- gpt_bigcode — GPTBigCodeForCausalLM (GPTBigCodeConfig model)
- gpt_neo — GPTNeoForCausalLM (GPTNeoConfig model)
- gpt_neox — GPTNeoXForCausalLM (GPTNeoXConfig model)
- gpt_neox_japanese — GPTNeoXJapaneseForCausalLM (GPTNeoXJapaneseConfig model)
- gpt_oss — GptOssForCausalLM (GptOssConfig model)
- gptj — GPTJForCausalLM (GPTJConfig model)
- granite — GraniteForCausalLM (GraniteConfig model)
- granitemoe — GraniteMoeForCausalLM (GraniteMoeConfig model)
- granitemoehybrid — GraniteMoeHybridForCausalLM (GraniteMoeHybridConfig model)
- granitemoeshared — GraniteMoeSharedForCausalLM (GraniteMoeSharedConfig model)
- helium — HeliumForCausalLM (HeliumConfig model)
- hunyuan_v1_dense — HunYuanDenseV1ForCausalLM (HunYuanDenseV1Config model)
- hunyuan_v1_moe — HunYuanMoEV1ForCausalLM (HunYuanMoEV1Config model)
- jais2 — Jais2ForCausalLM (Jais2Config model)
- jamba — JambaForCausalLM (JambaConfig model)
- jetmoe — JetMoeForCausalLM (JetMoeConfig model)
- lfm2 — Lfm2ForCausalLM (Lfm2Config model)
- lfm2_moe — Lfm2MoeForCausalLM (Lfm2MoeConfig model)
- llama — LlamaForCausalLM (LlamaConfig model)
- llama4 — Llama4ForCausalLM (Llama4Config model)
- llama4_text — Llama4ForCausalLM (Llama4TextConfig model)
- longcat_flash — LongcatFlashForCausalLM (LongcatFlashConfig model)
- mamba — MambaForCausalLM (MambaConfig model)
- mamba2 — Mamba2ForCausalLM (Mamba2Config model)
- marian — MarianForCausalLM (MarianConfig model)
- mbart — MBartForCausalLM (MBartConfig model)
- megatron-bert — MegatronBertForCausalLM (MegatronBertConfig model)
- minimax — MiniMaxForCausalLM (MiniMaxConfig model)
- minimax_m2 — MiniMaxM2ForCausalLM (MiniMaxM2Config model)
- ministral — MinistralForCausalLM (MinistralConfig model)
- ministral3 — Ministral3ForCausalLM (Ministral3Config model)
- mistral — MistralForCausalLM (MistralConfig model)
- mixtral — MixtralForCausalLM (MixtralConfig model)
- mllama — MllamaForCausalLM (MllamaConfig model)
- modernbert-decoder — ModernBertDecoderForCausalLM (ModernBertDecoderConfig model)
- moshi — MoshiForCausalLM (MoshiConfig model)
- mpt — MptForCausalLM (MptConfig model)
- musicgen — MusicgenForCausalLM (MusicgenConfig model)
- musicgen_melody — MusicgenMelodyForCausalLM (MusicgenMelodyConfig model)
- mvp — MvpForCausalLM (MvpConfig model)
- nanochat — NanoChatForCausalLM (NanoChatConfig model)
- nemotron — NemotronForCausalLM (NemotronConfig model)
- nemotron_h — NemotronHForCausalLM (NemotronHConfig model)
- olmo — OlmoForCausalLM (OlmoConfig model)
- olmo2 — Olmo2ForCausalLM (Olmo2Config model)
- olmo3 — Olmo3ForCausalLM (Olmo3Config model)
- olmo_hybrid — OlmoHybridForCausalLM (OlmoHybridConfig model)
- olmoe — OlmoeForCausalLM (OlmoeConfig model)
- openai-gpt — OpenAIGPTLMHeadModel (OpenAIGPTConfig model)
- opt — OPTForCausalLM (OPTConfig model)
- pegasus — PegasusForCausalLM (PegasusConfig model)
- persimmon — PersimmonForCausalLM (PersimmonConfig model)
- phi — PhiForCausalLM (PhiConfig model)
- phi3 — Phi3ForCausalLM (Phi3Config model)
- phi4_multimodal — Phi4MultimodalForCausalLM (Phi4MultimodalConfig model)
- phimoe — PhimoeForCausalLM (PhimoeConfig model)
- plbart — PLBartForCausalLM (PLBartConfig model)
- prophetnet — ProphetNetForCausalLM (ProphetNetConfig model)
- qwen2 — Qwen2ForCausalLM (Qwen2Config model)
- qwen2_moe — Qwen2MoeForCausalLM (Qwen2MoeConfig model)
- qwen3 — Qwen3ForCausalLM (Qwen3Config model)
- qwen3_5 — Qwen3_5ForCausalLM (Qwen3_5Config model)
- qwen3_5_moe — Qwen3_5MoeForCausalLM (Qwen3_5MoeConfig model)
- qwen3_5_moe_text — Qwen3_5MoeForCausalLM (Qwen3_5MoeTextConfig model)
- qwen3_5_text — Qwen3_5ForCausalLM (Qwen3_5TextConfig model)
- qwen3_moe — Qwen3MoeForCausalLM (Qwen3MoeConfig model)
- qwen3_next — Qwen3NextForCausalLM (Qwen3NextConfig model)
- recurrent_gemma — RecurrentGemmaForCausalLM (RecurrentGemmaConfig model)
- reformer — ReformerModelWithLMHead (ReformerConfig model)
- rembert — RemBertForCausalLM (RemBertConfig model)
- roberta — RobertaForCausalLM (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormForCausalLM (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertForCausalLM (RoCBertConfig model)
- roformer — RoFormerForCausalLM (RoFormerConfig model)
- rwkv — RwkvForCausalLM (RwkvConfig model)
- seed_oss — SeedOssForCausalLM (SeedOssConfig model)
- smollm3 — SmolLM3ForCausalLM (SmolLM3Config model)
- solar_open — SolarOpenForCausalLM (SolarOpenConfig model)
- stablelm — StableLmForCausalLM (StableLmConfig model)
- starcoder2 — Starcoder2ForCausalLM (Starcoder2Config model)
- trocr — TrOCRForCausalLM (TrOCRConfig model)
- vaultgemma — VaultGemmaForCausalLM (VaultGemmaConfig model)
- whisper — WhisperForCausalLM (WhisperConfig model)
- xglm — XGLMForCausalLM (XGLMConfig model)
- xlm — XLMWithLMHeadModel (XLMConfig model)
- xlm-roberta — XLMRobertaForCausalLM (XLMRobertaConfig model)
- xlm-roberta-xl — XLMRobertaXLForCausalLM (XLMRobertaXLConfig model)
- xlnet — XLNetLMHeadModel (XLNetConfig model)
- xlstm — xLSTMForCausalLM (xLSTMConfig model)
- xmod — XmodForCausalLM (XmodConfig model)
- youtu — YoutuForCausalLM (YoutuConfig model)
- zamba — ZambaForCausalLM (ZambaConfig model)
- zamba2 — Zamba2ForCausalLM (Zamba2Config model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForCausalLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForMaskedLM
This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForMaskedLM (AlbertConfig model)
- BartConfig configuration class: BartForConditionalGeneration (BartConfig model)
- BertConfig configuration class: BertForMaskedLM (BertConfig model)
- BigBirdConfig configuration class: BigBirdForMaskedLM (BigBirdConfig model)
- CamembertConfig configuration class: CamembertForMaskedLM (CamembertConfig model)
- ConvBertConfig configuration class: ConvBertForMaskedLM (ConvBertConfig model)
- Data2VecTextConfig configuration class: Data2VecTextForMaskedLM (Data2VecTextConfig model)
- DebertaConfig configuration class: DebertaForMaskedLM (DebertaConfig model)
- DebertaV2Config configuration class: DebertaV2ForMaskedLM (DebertaV2Config model)
- DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBertConfig model)
- ElectraConfig configuration class: ElectraForMaskedLM (ElectraConfig model)
- ErnieConfig configuration class: ErnieForMaskedLM (ErnieConfig model)
- EsmConfig configuration class: EsmForMaskedLM (EsmConfig model)
- EuroBertConfig configuration class: EuroBertForMaskedLM (EuroBertConfig model)
- FNetConfig configuration class: FNetForMaskedLM (FNetConfig model)
- FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlaubertConfig model)
- FunnelConfig configuration class: FunnelForMaskedLM (FunnelConfig model)
- IBertConfig configuration class: IBertForMaskedLM (IBertConfig model)
- JinaEmbeddingsV3Config configuration class: JinaEmbeddingsV3ForMaskedLM (JinaEmbeddingsV3Config model)
- LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLMConfig model)
- LongformerConfig configuration class: LongformerForMaskedLM (LongformerConfig model)
- LukeConfig configuration class: LukeForMaskedLM (LukeConfig model)
- MBartConfig configuration class: MBartForConditionalGeneration (MBartConfig model)
- MPNetConfig configuration class: MPNetForMaskedLM (MPNetConfig model)
- MegatronBertConfig configuration class: MegatronBertForMaskedLM (MegatronBertConfig model)
- MobileBertConfig configuration class: MobileBertForMaskedLM (MobileBertConfig model)
- ModernBertConfig configuration class: ModernBertForMaskedLM (ModernBertConfig model)
- ModernVBertConfig configuration class: ModernVBertForMaskedLM (ModernVBertConfig model)
- MraConfig configuration class: MraForMaskedLM (MraConfig model)
- MvpConfig configuration class: MvpForConditionalGeneration (MvpConfig model)
- NomicBertConfig configuration class: NomicBertForMaskedLM (NomicBertConfig model)
- NystromformerConfig configuration class: NystromformerForMaskedLM (NystromformerConfig model)
- PerceiverConfig configuration class: PerceiverForMaskedLM (PerceiverConfig model)
- ReformerConfig configuration class: ReformerForMaskedLM (ReformerConfig model)
- RemBertConfig configuration class: RemBertForMaskedLM (RemBertConfig model)
- RoCBertConfig configuration class: RoCBertForMaskedLM (RoCBertConfig model)
- RoFormerConfig configuration class: RoFormerForMaskedLM (RoFormerConfig model)
- RobertaConfig configuration class: RobertaForMaskedLM (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForMaskedLM (RobertaPreLayerNormConfig model)
- SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBertConfig model)
- TapasConfig configuration class: TapasForMaskedLM (TapasConfig model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMaskedLM (XLMRobertaXLConfig model)
- XmodConfig configuration class: XmodForMaskedLM (XmodConfig model)
- YosoConfig configuration class: YosoForMaskedLM (YosoConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
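A minimal sketch of this pattern, building a configuration from scratch rather than downloading one (the tiny hyperparameter values are arbitrary, chosen only to keep the example fast):

```python
from transformers import AutoModelForMaskedLM, BertConfig

# Build a configuration locally; no weights are downloaded.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)

# from_config dispatches on the configuration class, so this returns
# a BertForMaskedLM whose weights are randomly initialized.
model = AutoModelForMaskedLM.from_config(config)
print(type(model).__name__)  # BertForMaskedLM
```

Because only the configuration is used, the resulting model is untrained; call from_pretrained() instead when you need pretrained weights.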
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForMaskedLM (AlbertConfig model)
- bart — BartForConditionalGeneration (BartConfig model)
- bert — BertForMaskedLM (BertConfig model)
- big_bird — BigBirdForMaskedLM (BigBirdConfig model)
- camembert — CamembertForMaskedLM (CamembertConfig model)
- convbert — ConvBertForMaskedLM (ConvBertConfig model)
- data2vec-text — Data2VecTextForMaskedLM (Data2VecTextConfig model)
- deberta — DebertaForMaskedLM (DebertaConfig model)
- deberta-v2 — DebertaV2ForMaskedLM (DebertaV2Config model)
- distilbert — DistilBertForMaskedLM (DistilBertConfig model)
- electra — ElectraForMaskedLM (ElectraConfig model)
- ernie — ErnieForMaskedLM (ErnieConfig model)
- esm — EsmForMaskedLM (EsmConfig model)
- eurobert — EuroBertForMaskedLM (EuroBertConfig model)
- flaubert — FlaubertWithLMHeadModel (FlaubertConfig model)
- fnet — FNetForMaskedLM (FNetConfig model)
- funnel — FunnelForMaskedLM (FunnelConfig model)
- ibert — IBertForMaskedLM (IBertConfig model)
- jina_embeddings_v3 — JinaEmbeddingsV3ForMaskedLM (JinaEmbeddingsV3Config model)
- layoutlm — LayoutLMForMaskedLM (LayoutLMConfig model)
- longformer — LongformerForMaskedLM (LongformerConfig model)
- luke — LukeForMaskedLM (LukeConfig model)
- mbart — MBartForConditionalGeneration (MBartConfig model)
- megatron-bert — MegatronBertForMaskedLM (MegatronBertConfig model)
- mobilebert — MobileBertForMaskedLM (MobileBertConfig model)
- modernbert — ModernBertForMaskedLM (ModernBertConfig model)
- modernvbert — ModernVBertForMaskedLM (ModernVBertConfig model)
- mpnet — MPNetForMaskedLM (MPNetConfig model)
- mra — MraForMaskedLM (MraConfig model)
- mvp — MvpForConditionalGeneration (MvpConfig model)
- nomic_bert — NomicBertForMaskedLM (NomicBertConfig model)
- nystromformer — NystromformerForMaskedLM (NystromformerConfig model)
- perceiver — PerceiverForMaskedLM (PerceiverConfig model)
- reformer — ReformerForMaskedLM (ReformerConfig model)
- rembert — RemBertForMaskedLM (RemBertConfig model)
- roberta — RobertaForMaskedLM (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormForMaskedLM (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertForMaskedLM (RoCBertConfig model)
- roformer — RoFormerForMaskedLM (RoFormerConfig model)
- squeezebert — SqueezeBertForMaskedLM (SqueezeBertConfig model)
- tapas — TapasForMaskedLM (TapasConfig model)
- xlm — XLMWithLMHeadModel (XLMConfig model)
- xlm-roberta — XLMRobertaForMaskedLM (XLMRobertaConfig model)
- xlm-roberta-xl — XLMRobertaXLForMaskedLM (XLMRobertaXLConfig model)
- xmod — XmodForMaskedLM (XmodConfig model)
- yoso — YosoForMaskedLM (YosoConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForMaskedLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForMaskGeneration
AutoModelForSeq2SeqLM
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AudioFlamingo3Config configuration class: AudioFlamingo3ForConditionalGeneration (AudioFlamingo3Config model)
- BartConfig configuration class: BartForConditionalGeneration (BartConfig model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForConditionalGeneration (BigBirdPegasusConfig model)
- BlenderbotConfig configuration class: BlenderbotForConditionalGeneration (BlenderbotConfig model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallForConditionalGeneration (BlenderbotSmallConfig model)
- EncoderDecoderConfig configuration class: EncoderDecoderModel (EncoderDecoderConfig model)
- FSMTConfig configuration class: FSMTForConditionalGeneration (FSMTConfig model)
- GlmAsrConfig configuration class: GlmAsrForConditionalGeneration (GlmAsrConfig model)
- GraniteSpeechConfig configuration class: GraniteSpeechForConditionalGeneration (GraniteSpeechConfig model)
- LEDConfig configuration class: LEDForConditionalGeneration (LEDConfig model)
- LongT5Config configuration class: LongT5ForConditionalGeneration (LongT5Config model)
- M2M100Config configuration class: M2M100ForConditionalGeneration (M2M100Config model)
- MBartConfig configuration class: MBartForConditionalGeneration (MBartConfig model)
- MT5Config configuration class: MT5ForConditionalGeneration (MT5Config model)
- MarianConfig configuration class: MarianMTModel (MarianConfig model)
- MusicFlamingoConfig configuration class: MusicFlamingoForConditionalGeneration (MusicFlamingoConfig model)
- MvpConfig configuration class: MvpForConditionalGeneration (MvpConfig model)
- NllbMoeConfig configuration class: NllbMoeForConditionalGeneration (NllbMoeConfig model)
- PLBartConfig configuration class: PLBartForConditionalGeneration (PLBartConfig model)
- PegasusConfig configuration class: PegasusForConditionalGeneration (PegasusConfig model)
- PegasusXConfig configuration class: PegasusXForConditionalGeneration (PegasusXConfig model)
- ProphetNetConfig configuration class: ProphetNetForConditionalGeneration (ProphetNetConfig model)
- Qwen2AudioConfig configuration class: Qwen2AudioForConditionalGeneration (Qwen2AudioConfig model)
- SeamlessM4TConfig configuration class: SeamlessM4TForTextToText (SeamlessM4TConfig model)
- SeamlessM4Tv2Config configuration class: SeamlessM4Tv2ForTextToText (SeamlessM4Tv2Config model)
- SwitchTransformersConfig configuration class: SwitchTransformersForConditionalGeneration (SwitchTransformersConfig model)
- T5Config configuration class: T5ForConditionalGeneration (T5Config model)
- T5Gemma2Config configuration class: T5Gemma2ForConditionalGeneration (T5Gemma2Config model)
- T5GemmaConfig configuration class: T5GemmaForConditionalGeneration (T5GemmaConfig model)
- UMT5Config configuration class: UMT5ForConditionalGeneration (UMT5Config model)
- VibeVoiceAsrConfig configuration class: VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- VoxtralConfig configuration class: VoxtralForConditionalGeneration (VoxtralConfig model)
- VoxtralRealtimeConfig configuration class: VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
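The same pattern works for sequence-to-sequence heads; a minimal sketch with a deliberately tiny, locally built T5Config (the hyperparameter values are arbitrary):

```python
from transformers import AutoModelForSeq2SeqLM, T5Config

# A tiny T5 configuration built locally; nothing is downloaded.
config = T5Config(
    vocab_size=100,
    d_model=32,
    d_kv=8,
    d_ff=64,
    num_layers=2,
    num_heads=4,
)

# Dispatches on T5Config and returns a randomly initialized
# T5ForConditionalGeneration.
model = AutoModelForSeq2SeqLM.from_config(config)
print(type(model).__name__)  # T5ForConditionalGeneration
```

As documented above, from_config also accepts attn_implementation, e.g. AutoModelForSeq2SeqLM.from_config(config, attn_implementation="eager").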
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- audioflamingo3 — AudioFlamingo3ForConditionalGeneration (AudioFlamingo3Config model)
- bart — BartForConditionalGeneration (BartConfig model)
- bigbird_pegasus — BigBirdPegasusForConditionalGeneration (BigBirdPegasusConfig model)
- blenderbot — BlenderbotForConditionalGeneration (BlenderbotConfig model)
- blenderbot-small — BlenderbotSmallForConditionalGeneration (BlenderbotSmallConfig model)
- encoder-decoder — EncoderDecoderModel (EncoderDecoderConfig model)
- fsmt — FSMTForConditionalGeneration (FSMTConfig model)
- glmasr — GlmAsrForConditionalGeneration (GlmAsrConfig model)
- granite_speech — GraniteSpeechForConditionalGeneration (GraniteSpeechConfig model)
- led — LEDForConditionalGeneration (LEDConfig model)
- longt5 — LongT5ForConditionalGeneration (LongT5Config model)
- m2m_100 — M2M100ForConditionalGeneration (M2M100Config model)
- marian — MarianMTModel (MarianConfig model)
- mbart — MBartForConditionalGeneration (MBartConfig model)
- mt5 — MT5ForConditionalGeneration (MT5Config model)
- musicflamingo — MusicFlamingoForConditionalGeneration (MusicFlamingoConfig model)
- mvp — MvpForConditionalGeneration (MvpConfig model)
- nllb-moe — NllbMoeForConditionalGeneration (NllbMoeConfig model)
- pegasus — PegasusForConditionalGeneration (PegasusConfig model)
- pegasus_x — PegasusXForConditionalGeneration (PegasusXConfig model)
- plbart — PLBartForConditionalGeneration (PLBartConfig model)
- prophetnet — ProphetNetForConditionalGeneration (ProphetNetConfig model)
- qwen2_audio — Qwen2AudioForConditionalGeneration (Qwen2AudioConfig model)
- seamless_m4t — SeamlessM4TForTextToText (SeamlessM4TConfig model)
- seamless_m4t_v2 — SeamlessM4Tv2ForTextToText (SeamlessM4Tv2Config model)
- switch_transformers — SwitchTransformersForConditionalGeneration (SwitchTransformersConfig model)
- t5 — T5ForConditionalGeneration (T5Config model)
- t5gemma — T5GemmaForConditionalGeneration (T5GemmaConfig model)
- t5gemma2 — T5Gemma2ForConditionalGeneration (T5Gemma2Config model)
- umt5 — UMT5ForConditionalGeneration (UMT5Config model)
- vibevoice_asr — VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- voxtral — VoxtralForConditionalGeneration (VoxtralConfig model)
- voxtral_realtime — VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")
>>> # Update configuration during loading
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForSequenceClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForSequenceClassification (AlbertConfig model)
- ArceeConfig configuration class: ArceeForSequenceClassification (ArceeConfig model)
- BartConfig configuration class: BartForSequenceClassification (BartConfig model)
- BertConfig configuration class: BertForSequenceClassification (BertConfig model)
- BigBirdConfig configuration class: BigBirdForSequenceClassification (BigBirdConfig model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForSequenceClassification (BigBirdPegasusConfig model)
- BioGptConfig configuration class: BioGptForSequenceClassification (BioGptConfig model)
- BloomConfig configuration class: BloomForSequenceClassification (BloomConfig model)
- CTRLConfig configuration class: CTRLForSequenceClassification (CTRLConfig model)
- CamembertConfig configuration class: CamembertForSequenceClassification (CamembertConfig model)
- CanineConfig configuration class: CanineForSequenceClassification (CanineConfig model)
- ConvBertConfig configuration class: ConvBertForSequenceClassification (ConvBertConfig model)
- Data2VecTextConfig configuration class: Data2VecTextForSequenceClassification (Data2VecTextConfig model)
- DebertaConfig configuration class: DebertaForSequenceClassification (DebertaConfig model)
- DebertaV2Config configuration class: DebertaV2ForSequenceClassification (DebertaV2Config model)
- DeepseekV2Config configuration class: DeepseekV2ForSequenceClassification (DeepseekV2Config model)
- DeepseekV3Config configuration class: DeepseekV3ForSequenceClassification (DeepseekV3Config model)
- DiffLlamaConfig configuration class: DiffLlamaForSequenceClassification (DiffLlamaConfig model)
- DistilBertConfig configuration class: DistilBertForSequenceClassification (DistilBertConfig model)
- DogeConfig configuration class: DogeForSequenceClassification (DogeConfig model)
- ElectraConfig configuration class: ElectraForSequenceClassification (ElectraConfig model)
- ErnieConfig configuration class: ErnieForSequenceClassification (ErnieConfig model)
- EsmConfig configuration class: EsmForSequenceClassification (EsmConfig model)
- EuroBertConfig configuration class: EuroBertForSequenceClassification (EuroBertConfig model)
- Exaone4Config configuration class: Exaone4ForSequenceClassification (Exaone4Config model)
- FNetConfig configuration class: FNetForSequenceClassification (FNetConfig model)
- FalconConfig configuration class: FalconForSequenceClassification (FalconConfig model)
- FlaubertConfig configuration class: FlaubertForSequenceClassification (FlaubertConfig model)
- FunnelConfig configuration class: FunnelForSequenceClassification (FunnelConfig model)
- GPT2Config configuration class: GPT2ForSequenceClassification (GPT2Config model)
- GPTBigCodeConfig configuration class: GPTBigCodeForSequenceClassification (GPTBigCodeConfig model)
- GPTJConfig configuration class: GPTJForSequenceClassification (GPTJConfig model)
- GPTNeoConfig configuration class: GPTNeoForSequenceClassification (GPTNeoConfig model)
- GPTNeoXConfig configuration class: GPTNeoXForSequenceClassification (GPTNeoXConfig model)
- Gemma2Config configuration class: Gemma2ForSequenceClassification (Gemma2Config model)
- Gemma3Config configuration class: Gemma3ForSequenceClassification (Gemma3Config model)
- Gemma3TextConfig configuration class: Gemma3TextForSequenceClassification (Gemma3TextConfig model)
- GemmaConfig configuration class: GemmaForSequenceClassification (GemmaConfig model)
- Glm4Config configuration class: Glm4ForSequenceClassification (Glm4Config model)
- GlmConfig configuration class: GlmForSequenceClassification (GlmConfig model)
- GptOssConfig configuration class: GptOssForSequenceClassification (GptOssConfig model)
- HeliumConfig configuration class: HeliumForSequenceClassification (HeliumConfig model)
- HunYuanDenseV1Config configuration class: HunYuanDenseV1ForSequenceClassification (HunYuanDenseV1Config model)
- HunYuanMoEV1Config configuration class: HunYuanMoEV1ForSequenceClassification (HunYuanMoEV1Config model)
- IBertConfig configuration class: IBertForSequenceClassification (IBertConfig model)
- JambaConfig configuration class: JambaForSequenceClassification (JambaConfig model)
- JetMoeConfig configuration class: JetMoeForSequenceClassification (JetMoeConfig model)
- JinaEmbeddingsV3Config configuration class: JinaEmbeddingsV3ForSequenceClassification (JinaEmbeddingsV3Config model)
- LayoutLMConfig configuration class: LayoutLMForSequenceClassification (LayoutLMConfig model)
- LayoutLMv2Config configuration class: LayoutLMv2ForSequenceClassification (LayoutLMv2Config model)
- LayoutLMv3Config configuration class: LayoutLMv3ForSequenceClassification (LayoutLMv3Config model)
- LiltConfig configuration class: LiltForSequenceClassification (LiltConfig model)
- LlamaConfig configuration class: LlamaForSequenceClassification (LlamaConfig model)
- LongformerConfig configuration class: LongformerForSequenceClassification (LongformerConfig model)
- LukeConfig configuration class: LukeForSequenceClassification (LukeConfig model)
- MBartConfig configuration class: MBartForSequenceClassification (MBartConfig model)
- MPNetConfig configuration class: MPNetForSequenceClassification (MPNetConfig model)
- MT5Config configuration class: MT5ForSequenceClassification (MT5Config model)
- MarkupLMConfig configuration class: MarkupLMForSequenceClassification (MarkupLMConfig model)
- MegatronBertConfig configuration class: MegatronBertForSequenceClassification (MegatronBertConfig model)
- MiniMaxConfig configuration class: MiniMaxForSequenceClassification (MiniMaxConfig model)
- Ministral3Config configuration class: Ministral3ForSequenceClassification (Ministral3Config model)
- MinistralConfig configuration class: MinistralForSequenceClassification (MinistralConfig model)
- Mistral4Config configuration class: Mistral4ForSequenceClassification (Mistral4Config model)
- MistralConfig configuration class: MistralForSequenceClassification (MistralConfig model)
- MixtralConfig configuration class: MixtralForSequenceClassification (MixtralConfig model)
- MobileBertConfig configuration class: MobileBertForSequenceClassification (MobileBertConfig model)
- ModernBertConfig configuration class: ModernBertForSequenceClassification (ModernBertConfig model)
- ModernBertDecoderConfig configuration class: ModernBertDecoderForSequenceClassification (ModernBertDecoderConfig model)
- ModernVBertConfig configuration class: ModernVBertForSequenceClassification (ModernVBertConfig model)
- MptConfig configuration class: MptForSequenceClassification (MptConfig model)
- MraConfig configuration class: MraForSequenceClassification (MraConfig model)
- MvpConfig configuration class: MvpForSequenceClassification (MvpConfig model)
- NemotronConfig configuration class: NemotronForSequenceClassification (NemotronConfig model)
- NomicBertConfig configuration class: NomicBertForSequenceClassification (NomicBertConfig model)
- NystromformerConfig configuration class: NystromformerForSequenceClassification (NystromformerConfig model)
- OPTConfig configuration class: OPTForSequenceClassification (OPTConfig model)
- OpenAIGPTConfig configuration class: OpenAIGPTForSequenceClassification (OpenAIGPTConfig model)
- PLBartConfig configuration class: PLBartForSequenceClassification (PLBartConfig model)
- PerceiverConfig configuration class: PerceiverForSequenceClassification (PerceiverConfig model)
- PersimmonConfig configuration class: PersimmonForSequenceClassification (PersimmonConfig model)
- Phi3Config configuration class: Phi3ForSequenceClassification (Phi3Config model)
- PhiConfig configuration class: PhiForSequenceClassification (PhiConfig model)
- PhimoeConfig configuration class: PhimoeForSequenceClassification (PhimoeConfig model)
- Qwen2Config configuration class: Qwen2ForSequenceClassification (Qwen2Config model)
- Qwen2MoeConfig configuration class: Qwen2MoeForSequenceClassification (Qwen2MoeConfig model)
- Qwen3Config configuration class: Qwen3ForSequenceClassification (Qwen3Config model)
- Qwen3MoeConfig configuration class: Qwen3MoeForSequenceClassification (Qwen3MoeConfig model)
- Qwen3NextConfig configuration class: Qwen3NextForSequenceClassification (Qwen3NextConfig model)
- Qwen3_5Config configuration class: Qwen3_5ForSequenceClassification (Qwen3_5Config model)
- Qwen3_5TextConfig configuration class: Qwen3_5ForSequenceClassification (Qwen3_5TextConfig model)
- ReformerConfig configuration class: ReformerForSequenceClassification (ReformerConfig model)
- RemBertConfig configuration class: RemBertForSequenceClassification (RemBertConfig model)
- RoCBertConfig configuration class: RoCBertForSequenceClassification (RoCBertConfig model)
- RoFormerConfig configuration class: RoFormerForSequenceClassification (RoFormerConfig model)
- RobertaConfig configuration class: RobertaForSequenceClassification (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForSequenceClassification (RobertaPreLayerNormConfig model)
- SeedOssConfig configuration class: SeedOssForSequenceClassification (SeedOssConfig model)
- SmolLM3Config configuration class: SmolLM3ForSequenceClassification (SmolLM3Config model)
- SqueezeBertConfig configuration class: SqueezeBertForSequenceClassification (SqueezeBertConfig model)
- StableLmConfig configuration class: StableLmForSequenceClassification (StableLmConfig model)
- Starcoder2Config configuration class: Starcoder2ForSequenceClassification (Starcoder2Config model)
- T5Config configuration class: T5ForSequenceClassification (T5Config model)
- T5Gemma2Config configuration class: T5Gemma2ForSequenceClassification (T5Gemma2Config model)
- T5GemmaConfig configuration class: T5GemmaForSequenceClassification (T5GemmaConfig model)
- TapasConfig configuration class: TapasForSequenceClassification (TapasConfig model)
- UMT5Config configuration class: UMT5ForSequenceClassification (UMT5Config model)
- XLMConfig configuration class: XLMForSequenceClassification (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaForSequenceClassification (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForSequenceClassification (XLMRobertaXLConfig model)
- XLNetConfig configuration class: XLNetForSequenceClassification (XLNetConfig model)
- XmodConfig configuration class: XmodForSequenceClassification (XmodConfig model)
- YosoConfig configuration class: YosoForSequenceClassification (YosoConfig model)
- Zamba2Config configuration class: Zamba2ForSequenceClassification (Zamba2Config model)
- ZambaConfig configuration class: ZambaForSequenceClassification (ZambaConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
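As a sketch of the behavior described above (the tiny BERT configuration below is hypothetical, chosen only to keep the example fast; any registered configuration class resolves the same way):

```python
from transformers import BertConfig, AutoModelForSequenceClassification

# Build a small configuration locally; from_config creates the architecture
# only -- no weights are downloaded, the parameters are randomly initialized.
config = BertConfig(
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=128,
    num_labels=3,
)
model = AutoModelForSequenceClassification.from_config(config)
print(type(model).__name__)     # BertForSequenceClassification
print(model.config.num_labels)  # 3
```

Because BertConfig maps to BertForSequenceClassification, the auto class resolves the concrete model class from the configuration type alone.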
from_pretrained
< source >( pretrained_model_name_or_path *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, code_revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForSequenceClassification (AlbertConfig model)
- arcee — ArceeForSequenceClassification (ArceeConfig model)
- bart — BartForSequenceClassification (BartConfig model)
- bert — BertForSequenceClassification (BertConfig model)
- big_bird — BigBirdForSequenceClassification (BigBirdConfig model)
- bigbird_pegasus — BigBirdPegasusForSequenceClassification (BigBirdPegasusConfig model)
- biogpt — BioGptForSequenceClassification (BioGptConfig model)
- bloom — BloomForSequenceClassification (BloomConfig model)
- camembert — CamembertForSequenceClassification (CamembertConfig model)
- canine — CanineForSequenceClassification (CanineConfig model)
- convbert — ConvBertForSequenceClassification (ConvBertConfig model)
- ctrl — CTRLForSequenceClassification (CTRLConfig model)
- data2vec-text — Data2VecTextForSequenceClassification (Data2VecTextConfig model)
- deberta — DebertaForSequenceClassification (DebertaConfig model)
- deberta-v2 — DebertaV2ForSequenceClassification (DebertaV2Config model)
- deepseek_v2 — DeepseekV2ForSequenceClassification (DeepseekV2Config model)
- deepseek_v3 — DeepseekV3ForSequenceClassification (DeepseekV3Config model)
- diffllama — DiffLlamaForSequenceClassification (DiffLlamaConfig model)
- distilbert — DistilBertForSequenceClassification (DistilBertConfig model)
- doge — DogeForSequenceClassification (DogeConfig model)
- electra — ElectraForSequenceClassification (ElectraConfig model)
- ernie — ErnieForSequenceClassification (ErnieConfig model)
- esm — EsmForSequenceClassification (EsmConfig model)
- eurobert — EuroBertForSequenceClassification (EuroBertConfig model)
- exaone4 — Exaone4ForSequenceClassification (Exaone4Config model)
- falcon — FalconForSequenceClassification (FalconConfig model)
- flaubert — FlaubertForSequenceClassification (FlaubertConfig model)
- fnet — FNetForSequenceClassification (FNetConfig model)
- funnel — FunnelForSequenceClassification (FunnelConfig model)
- gemma — GemmaForSequenceClassification (GemmaConfig model)
- gemma2 — Gemma2ForSequenceClassification (Gemma2Config model)
- gemma3 — Gemma3ForSequenceClassification (Gemma3Config model)
- gemma3_text — Gemma3TextForSequenceClassification (Gemma3TextConfig model)
- glm — GlmForSequenceClassification (GlmConfig model)
- glm4 — Glm4ForSequenceClassification (Glm4Config model)
- gpt-sw3 — GPT2ForSequenceClassification (GPT2Config model)
- gpt2 — GPT2ForSequenceClassification (GPT2Config model)
- gpt_bigcode — GPTBigCodeForSequenceClassification (GPTBigCodeConfig model)
- gpt_neo — GPTNeoForSequenceClassification (GPTNeoConfig model)
- gpt_neox — GPTNeoXForSequenceClassification (GPTNeoXConfig model)
- gpt_oss — GptOssForSequenceClassification (GptOssConfig model)
- gptj — GPTJForSequenceClassification (GPTJConfig model)
- helium — HeliumForSequenceClassification (HeliumConfig model)
- hunyuan_v1_dense — HunYuanDenseV1ForSequenceClassification (HunYuanDenseV1Config model)
- hunyuan_v1_moe — HunYuanMoEV1ForSequenceClassification (HunYuanMoEV1Config model)
- ibert — IBertForSequenceClassification (IBertConfig model)
- jamba — JambaForSequenceClassification (JambaConfig model)
- jetmoe — JetMoeForSequenceClassification (JetMoeConfig model)
- jina_embeddings_v3 — JinaEmbeddingsV3ForSequenceClassification (JinaEmbeddingsV3Config model)
- layoutlm — LayoutLMForSequenceClassification (LayoutLMConfig model)
- layoutlmv2 — LayoutLMv2ForSequenceClassification (LayoutLMv2Config model)
- layoutlmv3 — LayoutLMv3ForSequenceClassification (LayoutLMv3Config model)
- lilt — LiltForSequenceClassification (LiltConfig model)
- llama — LlamaForSequenceClassification (LlamaConfig model)
- longformer — LongformerForSequenceClassification (LongformerConfig model)
- luke — LukeForSequenceClassification (LukeConfig model)
- markuplm — MarkupLMForSequenceClassification (MarkupLMConfig model)
- mbart — MBartForSequenceClassification (MBartConfig model)
- megatron-bert — MegatronBertForSequenceClassification (MegatronBertConfig model)
- minimax — MiniMaxForSequenceClassification (MiniMaxConfig model)
- ministral — MinistralForSequenceClassification (MinistralConfig model)
- ministral3 — Ministral3ForSequenceClassification (Ministral3Config model)
- mistral — MistralForSequenceClassification (MistralConfig model)
- mistral4 — Mistral4ForSequenceClassification (Mistral4Config model)
- mixtral — MixtralForSequenceClassification (MixtralConfig model)
- mobilebert — MobileBertForSequenceClassification (MobileBertConfig model)
- modernbert — ModernBertForSequenceClassification (ModernBertConfig model)
- modernbert-decoder — ModernBertDecoderForSequenceClassification (ModernBertDecoderConfig model)
- modernvbert — ModernVBertForSequenceClassification (ModernVBertConfig model)
- mpnet — MPNetForSequenceClassification (MPNetConfig model)
- mpt — MptForSequenceClassification (MptConfig model)
- mra — MraForSequenceClassification (MraConfig model)
- mt5 — MT5ForSequenceClassification (MT5Config model)
- mvp — MvpForSequenceClassification (MvpConfig model)
- nemotron — NemotronForSequenceClassification (NemotronConfig model)
- nomic_bert — NomicBertForSequenceClassification (NomicBertConfig model)
- nystromformer — NystromformerForSequenceClassification (NystromformerConfig model)
- openai-gpt — OpenAIGPTForSequenceClassification (OpenAIGPTConfig model)
- opt — OPTForSequenceClassification (OPTConfig model)
- perceiver — PerceiverForSequenceClassification (PerceiverConfig model)
- persimmon — PersimmonForSequenceClassification (PersimmonConfig model)
- phi — PhiForSequenceClassification (PhiConfig model)
- phi3 — Phi3ForSequenceClassification (Phi3Config model)
- phimoe — PhimoeForSequenceClassification (PhimoeConfig model)
- plbart — PLBartForSequenceClassification (PLBartConfig model)
- qwen2 — Qwen2ForSequenceClassification (Qwen2Config model)
- qwen2_moe — Qwen2MoeForSequenceClassification (Qwen2MoeConfig model)
- qwen3 — Qwen3ForSequenceClassification (Qwen3Config model)
- qwen3_5 — Qwen3_5ForSequenceClassification (Qwen3_5Config model)
- qwen3_5_text — Qwen3_5ForSequenceClassification (Qwen3_5TextConfig model)
- qwen3_moe — Qwen3MoeForSequenceClassification (Qwen3MoeConfig model)
- qwen3_next — Qwen3NextForSequenceClassification (Qwen3NextConfig model)
- reformer — ReformerForSequenceClassification (ReformerConfig model)
- rembert — RemBertForSequenceClassification (RemBertConfig model)
- roberta — RobertaForSequenceClassification (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormForSequenceClassification (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertForSequenceClassification (RoCBertConfig model)
- roformer — RoFormerForSequenceClassification (RoFormerConfig model)
- seed_oss — SeedOssForSequenceClassification (SeedOssConfig model)
- smollm3 — SmolLM3ForSequenceClassification (SmolLM3Config model)
- squeezebert — SqueezeBertForSequenceClassification (SqueezeBertConfig model)
- stablelm — StableLmForSequenceClassification (StableLmConfig model)
- starcoder2 — Starcoder2ForSequenceClassification (Starcoder2Config model)
- t5 — T5ForSequenceClassification (T5Config model)
- t5gemma — T5GemmaForSequenceClassification (T5GemmaConfig model)
- t5gemma2 — T5Gemma2ForSequenceClassification (T5Gemma2Config model)
- tapas — TapasForSequenceClassification (TapasConfig model)
- umt5 — UMT5ForSequenceClassification (UMT5Config model)
- xlm — XLMForSequenceClassification (XLMConfig model)
- xlm-roberta — XLMRobertaForSequenceClassification (XLMRobertaConfig model)
- xlm-roberta-xl — XLMRobertaXLForSequenceClassification (XLMRobertaXLConfig model)
- xlnet — XLNetForSequenceClassification (XLNetConfig model)
- xmod — XmodForSequenceClassification (XmodConfig model)
- yoso — YosoForSequenceClassification (YosoConfig model)
- zamba — ZambaForSequenceClassification (ZambaConfig model)
- zamba2 — Zamba2ForSequenceClassification (Zamba2Config model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForSequenceClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForMultipleChoice
This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( config **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForMultipleChoice (AlbertConfig model)
- BertConfig configuration class: BertForMultipleChoice (BertConfig model)
- BigBirdConfig configuration class: BigBirdForMultipleChoice (BigBirdConfig model)
- CamembertConfig configuration class: CamembertForMultipleChoice (CamembertConfig model)
- CanineConfig configuration class: CanineForMultipleChoice (CanineConfig model)
- ConvBertConfig configuration class: ConvBertForMultipleChoice (ConvBertConfig model)
- Data2VecTextConfig configuration class: Data2VecTextForMultipleChoice (Data2VecTextConfig model)
- DebertaV2Config configuration class: DebertaV2ForMultipleChoice (DebertaV2Config model)
- DistilBertConfig configuration class: DistilBertForMultipleChoice (DistilBertConfig model)
- ElectraConfig configuration class: ElectraForMultipleChoice (ElectraConfig model)
- ErnieConfig configuration class: ErnieForMultipleChoice (ErnieConfig model)
- FNetConfig configuration class: FNetForMultipleChoice (FNetConfig model)
- FlaubertConfig configuration class: FlaubertForMultipleChoice (FlaubertConfig model)
- FunnelConfig configuration class: FunnelForMultipleChoice (FunnelConfig model)
- IBertConfig configuration class: IBertForMultipleChoice (IBertConfig model)
- LongformerConfig configuration class: LongformerForMultipleChoice (LongformerConfig model)
- LukeConfig configuration class: LukeForMultipleChoice (LukeConfig model)
- MPNetConfig configuration class: MPNetForMultipleChoice (MPNetConfig model)
- MegatronBertConfig configuration class: MegatronBertForMultipleChoice (MegatronBertConfig model)
- MobileBertConfig configuration class: MobileBertForMultipleChoice (MobileBertConfig model)
- ModernBertConfig configuration class: ModernBertForMultipleChoice (ModernBertConfig model)
- MraConfig configuration class: MraForMultipleChoice (MraConfig model)
- NystromformerConfig configuration class: NystromformerForMultipleChoice (NystromformerConfig model)
- RemBertConfig configuration class: RemBertForMultipleChoice (RemBertConfig model)
- RoCBertConfig configuration class: RoCBertForMultipleChoice (RoCBertConfig model)
- RoFormerConfig configuration class: RoFormerForMultipleChoice (RoFormerConfig model)
- RobertaConfig configuration class: RobertaForMultipleChoice (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForMultipleChoice (RobertaPreLayerNormConfig model)
- SqueezeBertConfig configuration class: SqueezeBertForMultipleChoice (SqueezeBertConfig model)
- XLMConfig configuration class: XLMForMultipleChoice (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaForMultipleChoice (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMultipleChoice (XLMRobertaXLConfig model)
- XLNetConfig configuration class: XLNetForMultipleChoice (XLNetConfig model)
- XmodConfig configuration class: XmodForMultipleChoice (XmodConfig model)
- YosoConfig configuration class: YosoForMultipleChoice (YosoConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, SDPA is used for torch>=2.1.1 when available; otherwise the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( pretrained_model_name_or_path *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, code_revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForMultipleChoice (AlbertConfig model)
- bert — BertForMultipleChoice (BertConfig model)
- big_bird — BigBirdForMultipleChoice (BigBirdConfig model)
- camembert — CamembertForMultipleChoice (CamembertConfig model)
- canine — CanineForMultipleChoice (CanineConfig model)
- convbert — ConvBertForMultipleChoice (ConvBertConfig model)
- data2vec-text — Data2VecTextForMultipleChoice (Data2VecTextConfig model)
- deberta-v2 — DebertaV2ForMultipleChoice (DebertaV2Config model)
- distilbert — DistilBertForMultipleChoice (DistilBertConfig model)
- electra — ElectraForMultipleChoice (ElectraConfig model)
- ernie — ErnieForMultipleChoice (ErnieConfig model)
- flaubert — FlaubertForMultipleChoice (FlaubertConfig model)
- fnet — FNetForMultipleChoice (FNetConfig model)
- funnel — FunnelForMultipleChoice (FunnelConfig model)
- ibert — IBertForMultipleChoice (IBertConfig model)
- longformer — LongformerForMultipleChoice (LongformerConfig model)
- luke — LukeForMultipleChoice (LukeConfig model)
- megatron-bert — MegatronBertForMultipleChoice (MegatronBertConfig model)
- mobilebert — MobileBertForMultipleChoice (MobileBertConfig model)
- modernbert — ModernBertForMultipleChoice (ModernBertConfig model)
- mpnet — MPNetForMultipleChoice (MPNetConfig model)
- mra — MraForMultipleChoice (MraConfig model)
- nystromformer — NystromformerForMultipleChoice (NystromformerConfig model)
- rembert — RemBertForMultipleChoice (RemBertConfig model)
- roberta — RobertaForMultipleChoice (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormForMultipleChoice (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertForMultipleChoice (RoCBertConfig model)
- roformer — RoFormerForMultipleChoice (RoFormerConfig model)
- squeezebert — SqueezeBertForMultipleChoice (SqueezeBertConfig model)
- xlm — XLMForMultipleChoice (XLMConfig model)
- xlm-roberta — XLMRobertaForMultipleChoice (XLMRobertaConfig model)
- xlm-roberta-xl — XLMRobertaXLForMultipleChoice (XLMRobertaXLConfig model)
- xlnet — XLNetForMultipleChoice (XLNetConfig model)
- xmod — XmodForMultipleChoice (XmodConfig model)
- yoso — YosoForMultipleChoice (YosoConfig model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForMultipleChoice
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForNextSentencePrediction
This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( config **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- BertConfig configuration class: BertForNextSentencePrediction (BertConfig model)
- ErnieConfig configuration class: ErnieForNextSentencePrediction (ErnieConfig model)
- FNetConfig configuration class: FNetForNextSentencePrediction (FNetConfig model)
- MegatronBertConfig configuration class: MegatronBertForNextSentencePrediction (MegatronBertConfig model)
- MobileBertConfig configuration class: MobileBertForNextSentencePrediction (MobileBertConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, SDPA is used for torch>=2.1.1 when available; otherwise the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
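The note above can be observed directly: two from_config instantiations of the same (hypothetical, deliberately tiny) configuration produce independently initialized weights.

```python
import torch
from transformers import BertConfig, AutoModelForNextSentencePrediction

config = BertConfig(
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=128,
)
# from_config builds the architecture only; parameters are randomly
# initialized, so two instantiations do not share weights.
a = AutoModelForNextSentencePrediction.from_config(config)
b = AutoModelForNextSentencePrediction.from_config(config)
identical = all(torch.equal(p, q) for p, q in zip(a.parameters(), b.parameters()))
print(identical)  # False: use from_pretrained() to obtain trained weights
```

This is why loading a model for inference should go through from_pretrained(), which restores the saved weights in addition to the configuration.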
from_pretrained
< source >( pretrained_model_name_or_path *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, code_revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- bert — BertForNextSentencePrediction (BertConfig model)
- ernie — ErnieForNextSentencePrediction (ErnieConfig model)
- fnet — FNetForNextSentencePrediction (FNetConfig model)
- megatron-bert — MegatronBertForNextSentencePrediction (MegatronBertConfig model)
- mobilebert — MobileBertForNextSentencePrediction (MobileBertConfig model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForTokenClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForTokenClassification (AlbertConfig model)
- ApertusConfig configuration class: ApertusForTokenClassification (ApertusConfig model)
- ArceeConfig configuration class: ArceeForTokenClassification (ArceeConfig model)
- BertConfig configuration class: BertForTokenClassification (BertConfig model)
- BigBirdConfig configuration class: BigBirdForTokenClassification (BigBirdConfig model)
- BioGptConfig configuration class: BioGptForTokenClassification (BioGptConfig model)
- BloomConfig configuration class: BloomForTokenClassification (BloomConfig model)
- BrosConfig configuration class: BrosForTokenClassification (BrosConfig model)
- CamembertConfig configuration class: CamembertForTokenClassification (CamembertConfig model)
- CanineConfig configuration class: CanineForTokenClassification (CanineConfig model)
- ConvBertConfig configuration class: ConvBertForTokenClassification (ConvBertConfig model)
- Data2VecTextConfig configuration class: Data2VecTextForTokenClassification (Data2VecTextConfig model)
- DebertaConfig configuration class: DebertaForTokenClassification (DebertaConfig model)
- DebertaV2Config configuration class: DebertaV2ForTokenClassification (DebertaV2Config model)
- DeepseekV3Config configuration class: DeepseekV3ForTokenClassification (DeepseekV3Config model)
- DiffLlamaConfig configuration class: DiffLlamaForTokenClassification (DiffLlamaConfig model)
- DistilBertConfig configuration class: DistilBertForTokenClassification (DistilBertConfig model)
- ElectraConfig configuration class: ElectraForTokenClassification (ElectraConfig model)
- ErnieConfig configuration class: ErnieForTokenClassification (ErnieConfig model)
- EsmConfig configuration class: EsmForTokenClassification (EsmConfig model)
- EuroBertConfig configuration class: EuroBertForTokenClassification (EuroBertConfig model)
- Exaone4Config configuration class: Exaone4ForTokenClassification (Exaone4Config model)
- FNetConfig configuration class: FNetForTokenClassification (FNetConfig model)
- FalconConfig configuration class: FalconForTokenClassification (FalconConfig model)
- FlaubertConfig configuration class: FlaubertForTokenClassification (FlaubertConfig model)
- FunnelConfig configuration class: FunnelForTokenClassification (FunnelConfig model)
- GPT2Config configuration class: GPT2ForTokenClassification (GPT2Config model)
- GPTBigCodeConfig configuration class: GPTBigCodeForTokenClassification (GPTBigCodeConfig model)
- GPTNeoConfig configuration class: GPTNeoForTokenClassification (GPTNeoConfig model)
- GPTNeoXConfig configuration class: GPTNeoXForTokenClassification (GPTNeoXConfig model)
- Gemma2Config configuration class: Gemma2ForTokenClassification (Gemma2Config model)
- GemmaConfig configuration class: GemmaForTokenClassification (GemmaConfig model)
- Glm4Config configuration class: Glm4ForTokenClassification (Glm4Config model)
- GlmConfig configuration class: GlmForTokenClassification (GlmConfig model)
- GptOssConfig configuration class: GptOssForTokenClassification (GptOssConfig model)
- HeliumConfig configuration class: HeliumForTokenClassification (HeliumConfig model)
- IBertConfig configuration class: IBertForTokenClassification (IBertConfig model)
- JinaEmbeddingsV3Config configuration class: JinaEmbeddingsV3ForTokenClassification (JinaEmbeddingsV3Config model)
- LayoutLMConfig configuration class: LayoutLMForTokenClassification (LayoutLMConfig model)
- LayoutLMv2Config configuration class: LayoutLMv2ForTokenClassification (LayoutLMv2Config model)
- LayoutLMv3Config configuration class: LayoutLMv3ForTokenClassification (LayoutLMv3Config model)
- LiltConfig configuration class: LiltForTokenClassification (LiltConfig model)
- LlamaConfig configuration class: LlamaForTokenClassification (LlamaConfig model)
- LongformerConfig configuration class: LongformerForTokenClassification (LongformerConfig model)
- LukeConfig configuration class: LukeForTokenClassification (LukeConfig model)
- MPNetConfig configuration class: MPNetForTokenClassification (MPNetConfig model)
- MT5Config configuration class: MT5ForTokenClassification (MT5Config model)
- MarkupLMConfig configuration class: MarkupLMForTokenClassification (MarkupLMConfig model)
- MegatronBertConfig configuration class: MegatronBertForTokenClassification (MegatronBertConfig model)
- MiniMaxConfig configuration class: MiniMaxForTokenClassification (MiniMaxConfig model)
- Ministral3Config configuration class: Ministral3ForTokenClassification (Ministral3Config model)
- MinistralConfig configuration class: MinistralForTokenClassification (MinistralConfig model)
- Mistral4Config configuration class: Mistral4ForTokenClassification (Mistral4Config model)
- MistralConfig configuration class: MistralForTokenClassification (MistralConfig model)
- MixtralConfig configuration class: MixtralForTokenClassification (MixtralConfig model)
- MobileBertConfig configuration class: MobileBertForTokenClassification (MobileBertConfig model)
- ModernBertConfig configuration class: ModernBertForTokenClassification (ModernBertConfig model)
- ModernVBertConfig configuration class: ModernVBertForTokenClassification (ModernVBertConfig model)
- MptConfig configuration class: MptForTokenClassification (MptConfig model)
- MraConfig configuration class: MraForTokenClassification (MraConfig model)
- NemotronConfig configuration class: NemotronForTokenClassification (NemotronConfig model)
- NomicBertConfig configuration class: NomicBertForTokenClassification (NomicBertConfig model)
- NystromformerConfig configuration class: NystromformerForTokenClassification (NystromformerConfig model)
- PersimmonConfig configuration class: PersimmonForTokenClassification (PersimmonConfig model)
- Phi3Config configuration class: Phi3ForTokenClassification (Phi3Config model)
- PhiConfig configuration class: PhiForTokenClassification (PhiConfig model)
- Qwen2Config configuration class: Qwen2ForTokenClassification (Qwen2Config model)
- Qwen2MoeConfig configuration class: Qwen2MoeForTokenClassification (Qwen2MoeConfig model)
- Qwen3Config configuration class: Qwen3ForTokenClassification (Qwen3Config model)
- Qwen3MoeConfig configuration class: Qwen3MoeForTokenClassification (Qwen3MoeConfig model)
- Qwen3NextConfig configuration class: Qwen3NextForTokenClassification (Qwen3NextConfig model)
- RemBertConfig configuration class: RemBertForTokenClassification (RemBertConfig model)
- RoCBertConfig configuration class: RoCBertForTokenClassification (RoCBertConfig model)
- RoFormerConfig configuration class: RoFormerForTokenClassification (RoFormerConfig model)
- RobertaConfig configuration class: RobertaForTokenClassification (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForTokenClassification (RobertaPreLayerNormConfig model)
- SeedOssConfig configuration class: SeedOssForTokenClassification (SeedOssConfig model)
- SmolLM3Config configuration class: SmolLM3ForTokenClassification (SmolLM3Config model)
- SqueezeBertConfig configuration class: SqueezeBertForTokenClassification (SqueezeBertConfig model)
- StableLmConfig configuration class: StableLmForTokenClassification (StableLmConfig model)
- Starcoder2Config configuration class: Starcoder2ForTokenClassification (Starcoder2Config model)
- T5Config configuration class: T5ForTokenClassification (T5Config model)
- T5Gemma2Config configuration class: T5Gemma2ForTokenClassification (T5Gemma2Config model)
- T5GemmaConfig configuration class: T5GemmaForTokenClassification (T5GemmaConfig model)
- UMT5Config configuration class: UMT5ForTokenClassification (UMT5Config model)
- XLMConfig configuration class: XLMForTokenClassification (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaForTokenClassification (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForTokenClassification (XLMRobertaXLConfig model)
- XLNetConfig configuration class: XLNetForTokenClassification (XLNetConfig model)
- XmodConfig configuration class: XmodForTokenClassification (XmodConfig model)
- YosoConfig configuration class: YosoForTokenClassification (YosoConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with a token classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
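The difference between config-only instantiation and weight loading can be sketched without the library at all. In this framework-free illustration (ToyConfig and ToyModel are invented names, not transformers classes), from_config sizes fresh, randomly initialized weights from the configuration alone, while from_pretrained additionally copies saved weights in:

```python
import random

class ToyConfig:
    def __init__(self, hidden_size=4):
        self.hidden_size = hidden_size

class ToyModel:
    """Weights are a flat list; their shape comes from the config alone."""
    def __init__(self, config):
        self.config = config
        # from_config path: architecture from config, weights random
        self.weights = [random.gauss(0.0, 0.02) for _ in range(config.hidden_size)]

    @classmethod
    def from_config(cls, config):
        return cls(config)  # no weights are loaded

    @classmethod
    def from_pretrained(cls, config, state_dict):
        model = cls(config)
        model.weights = list(state_dict["weights"])  # overwrite with saved weights
        return model

config = ToyConfig(hidden_size=3)
fresh = ToyModel.from_config(config)            # randomly initialized
loaded = ToyModel.from_pretrained(config, {"weights": [0.1, 0.2, 0.3]})
assert loaded.weights == [0.1, 0.2, 0.3]
```

A model built via from_config therefore has the right architecture but meaningless weights until it is trained or a state dict is loaded.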
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from a saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should first check whether using save_pretrained() and from_pretrained() would not be a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and instantiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
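The kwargs routing described in the parameter list can be sketched in plain Python. This is a simplified illustration of the idea (split_kwargs and ToyConfig are invented names, not the actual loading code):

```python
class ToyConfig:
    """Stand-in config with a couple of attributes."""
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 8

def split_kwargs(config, **kwargs):
    """Route kwargs: keys matching config attributes override the config;
    everything else is left over for the model's __init__."""
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)   # override the configuration attribute
        else:
            model_kwargs[key] = value     # pass through to the model
    return config, model_kwargs

config, model_kwargs = split_kwargs(
    ToyConfig(), output_attentions=True, some_model_arg=123
)
assert config.output_attentions is True
assert model_kwargs == {"some_model_arg": 123}
```

This is why, when you let the config be loaded automatically, the same kwargs dict can both tweak the configuration and feed the model constructor.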
Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForTokenClassification (AlbertConfig model)
- apertus — ApertusForTokenClassification (ApertusConfig model)
- arcee — ArceeForTokenClassification (ArceeConfig model)
- bert — BertForTokenClassification (BertConfig model)
- big_bird — BigBirdForTokenClassification (BigBirdConfig model)
- biogpt — BioGptForTokenClassification (BioGptConfig model)
- bloom — BloomForTokenClassification (BloomConfig model)
- bros — BrosForTokenClassification (BrosConfig model)
- camembert — CamembertForTokenClassification (CamembertConfig model)
- canine — CanineForTokenClassification (CanineConfig model)
- convbert — ConvBertForTokenClassification (ConvBertConfig model)
- data2vec-text — Data2VecTextForTokenClassification (Data2VecTextConfig model)
- deberta — DebertaForTokenClassification (DebertaConfig model)
- deberta-v2 — DebertaV2ForTokenClassification (DebertaV2Config model)
- deepseek_v3 — DeepseekV3ForTokenClassification (DeepseekV3Config model)
- diffllama — DiffLlamaForTokenClassification (DiffLlamaConfig model)
- distilbert — DistilBertForTokenClassification (DistilBertConfig model)
- electra — ElectraForTokenClassification (ElectraConfig model)
- ernie — ErnieForTokenClassification (ErnieConfig model)
- esm — EsmForTokenClassification (EsmConfig model)
- eurobert — EuroBertForTokenClassification (EuroBertConfig model)
- exaone4 — Exaone4ForTokenClassification (Exaone4Config model)
- falcon — FalconForTokenClassification (FalconConfig model)
- flaubert — FlaubertForTokenClassification (FlaubertConfig model)
- fnet — FNetForTokenClassification (FNetConfig model)
- funnel — FunnelForTokenClassification (FunnelConfig model)
- gemma — GemmaForTokenClassification (GemmaConfig model)
- gemma2 — Gemma2ForTokenClassification (Gemma2Config model)
- glm — GlmForTokenClassification (GlmConfig model)
- glm4 — Glm4ForTokenClassification (Glm4Config model)
- gpt-sw3 — GPT2ForTokenClassification (GPT2Config model)
- gpt2 — GPT2ForTokenClassification (GPT2Config model)
- gpt_bigcode — GPTBigCodeForTokenClassification (GPTBigCodeConfig model)
- gpt_neo — GPTNeoForTokenClassification (GPTNeoConfig model)
- gpt_neox — GPTNeoXForTokenClassification (GPTNeoXConfig model)
- gpt_oss — GptOssForTokenClassification (GptOssConfig model)
- helium — HeliumForTokenClassification (HeliumConfig model)
- ibert — IBertForTokenClassification (IBertConfig model)
- jina_embeddings_v3 — JinaEmbeddingsV3ForTokenClassification (JinaEmbeddingsV3Config model)
- layoutlm — LayoutLMForTokenClassification (LayoutLMConfig model)
- layoutlmv2 — LayoutLMv2ForTokenClassification (LayoutLMv2Config model)
- layoutlmv3 — LayoutLMv3ForTokenClassification (LayoutLMv3Config model)
- lilt — LiltForTokenClassification (LiltConfig model)
- llama — LlamaForTokenClassification (LlamaConfig model)
- longformer — LongformerForTokenClassification (LongformerConfig model)
- luke — LukeForTokenClassification (LukeConfig model)
- markuplm — MarkupLMForTokenClassification (MarkupLMConfig model)
- megatron-bert — MegatronBertForTokenClassification (MegatronBertConfig model)
- minimax — MiniMaxForTokenClassification (MiniMaxConfig model)
- ministral — MinistralForTokenClassification (MinistralConfig model)
- ministral3 — Ministral3ForTokenClassification (Ministral3Config model)
- mistral — MistralForTokenClassification (MistralConfig model)
- mistral4 — Mistral4ForTokenClassification (Mistral4Config model)
- mixtral — MixtralForTokenClassification (MixtralConfig model)
- mobilebert — MobileBertForTokenClassification (MobileBertConfig model)
- modernbert — ModernBertForTokenClassification (ModernBertConfig model)
- modernvbert — ModernVBertForTokenClassification (ModernVBertConfig model)
- mpnet — MPNetForTokenClassification (MPNetConfig model)
- mpt — MptForTokenClassification (MptConfig model)
- mra — MraForTokenClassification (MraConfig model)
- mt5 — MT5ForTokenClassification (MT5Config model)
- nemotron — NemotronForTokenClassification (NemotronConfig model)
- nomic_bert — NomicBertForTokenClassification (NomicBertConfig model)
- nystromformer — NystromformerForTokenClassification (NystromformerConfig model)
- persimmon — PersimmonForTokenClassification (PersimmonConfig model)
- phi — PhiForTokenClassification (PhiConfig model)
- phi3 — Phi3ForTokenClassification (Phi3Config model)
- qwen2 — Qwen2ForTokenClassification (Qwen2Config model)
- qwen2_moe — Qwen2MoeForTokenClassification (Qwen2MoeConfig model)
- qwen3 — Qwen3ForTokenClassification (Qwen3Config model)
- qwen3_moe — Qwen3MoeForTokenClassification (Qwen3MoeConfig model)
- qwen3_next — Qwen3NextForTokenClassification (Qwen3NextConfig model)
- rembert — RemBertForTokenClassification (RemBertConfig model)
- roberta — RobertaForTokenClassification (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormForTokenClassification (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertForTokenClassification (RoCBertConfig model)
- roformer — RoFormerForTokenClassification (RoFormerConfig model)
- seed_oss — SeedOssForTokenClassification (SeedOssConfig model)
- smollm3 — SmolLM3ForTokenClassification (SmolLM3Config model)
- squeezebert — SqueezeBertForTokenClassification (SqueezeBertConfig model)
- stablelm — StableLmForTokenClassification (StableLmConfig model)
- starcoder2 — Starcoder2ForTokenClassification (Starcoder2Config model)
- t5 — T5ForTokenClassification (T5Config model)
- t5gemma — T5GemmaForTokenClassification (T5GemmaConfig model)
- t5gemma2 — T5Gemma2ForTokenClassification (T5Gemma2Config model)
- umt5 — UMT5ForTokenClassification (UMT5Config model)
- xlm — XLMForTokenClassification (XLMConfig model)
- xlm-roberta — XLMRobertaForTokenClassification (XLMRobertaConfig model)
- xlm-roberta-xl — XLMRobertaXLForTokenClassification (XLMRobertaXLConfig model)
- xlnet — XLNetForTokenClassification (XLNetConfig model)
- xmod — XmodForTokenClassification (XmodConfig model)
- yoso — YosoForTokenClassification (YosoConfig model)
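The selection logic above (look up the config's model_type, fall back to pattern matching on the name or path) can be sketched as follows. The mapping below is a tiny illustrative subset, and resolve_model_class is an invented helper, not the real dispatch code:

```python
# Illustrative subset of the model_type -> class mapping above.
# "distilbert" is listed before "bert" so the longer pattern wins
# during substring fallback matching.
TOKEN_CLASSIFICATION_MAPPING = {
    "distilbert": "DistilBertForTokenClassification",
    "roberta": "RobertaForTokenClassification",
    "bert": "BertForTokenClassification",
}

def resolve_model_class(model_type, name_or_path):
    """Prefer the config's model_type; fall back to substring
    matching on the pretrained name/path when it is missing."""
    if model_type is not None:
        return TOKEN_CLASSIFICATION_MAPPING[model_type]
    for key, cls_name in TOKEN_CLASSIFICATION_MAPPING.items():
        if key in name_or_path:
            return cls_name
    raise ValueError(f"Could not infer a model class from {name_or_path!r}")

assert resolve_model_class("bert", "anything") == "BertForTokenClassification"
assert resolve_model_class(None, "my-org/distilbert-ner") == "DistilBertForTokenClassification"
```

The fallback explains why a config with a correct model_type is more reliable than relying on the checkpoint name alone.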
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForTokenClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForQuestionAnswering
This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForQuestionAnswering (AlbertConfig model)
- ArceeConfig configuration class: ArceeForQuestionAnswering (ArceeConfig model)
- BartConfig configuration class: BartForQuestionAnswering (BartConfig model)
- BertConfig configuration class: BertForQuestionAnswering (BertConfig model)
- BigBirdConfig configuration class: BigBirdForQuestionAnswering (BigBirdConfig model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForQuestionAnswering (BigBirdPegasusConfig model)
- BloomConfig configuration class: BloomForQuestionAnswering (BloomConfig model)
- CamembertConfig configuration class: CamembertForQuestionAnswering (CamembertConfig model)
- CanineConfig configuration class: CanineForQuestionAnswering (CanineConfig model)
- ConvBertConfig configuration class: ConvBertForQuestionAnswering (ConvBertConfig model)
- Data2VecTextConfig configuration class: Data2VecTextForQuestionAnswering (Data2VecTextConfig model)
- DebertaConfig configuration class: DebertaForQuestionAnswering (DebertaConfig model)
- DebertaV2Config configuration class: DebertaV2ForQuestionAnswering (DebertaV2Config model)
- DiffLlamaConfig configuration class: DiffLlamaForQuestionAnswering (DiffLlamaConfig model)
- DistilBertConfig configuration class: DistilBertForQuestionAnswering (DistilBertConfig model)
- ElectraConfig configuration class: ElectraForQuestionAnswering (ElectraConfig model)
- ErnieConfig configuration class: ErnieForQuestionAnswering (ErnieConfig model)
- Exaone4Config configuration class: Exaone4ForQuestionAnswering (Exaone4Config model)
- FNetConfig configuration class: FNetForQuestionAnswering (FNetConfig model)
- FalconConfig configuration class: FalconForQuestionAnswering (FalconConfig model)
- FlaubertConfig configuration class: FlaubertForQuestionAnsweringSimple (FlaubertConfig model)
- FunnelConfig configuration class: FunnelForQuestionAnswering (FunnelConfig model)
- GPT2Config configuration class: GPT2ForQuestionAnswering (GPT2Config model)
- GPTJConfig configuration class: GPTJForQuestionAnswering (GPTJConfig model)
- GPTNeoConfig configuration class: GPTNeoForQuestionAnswering (GPTNeoConfig model)
- GPTNeoXConfig configuration class: GPTNeoXForQuestionAnswering (GPTNeoXConfig model)
- IBertConfig configuration class: IBertForQuestionAnswering (IBertConfig model)
- JinaEmbeddingsV3Config configuration class: JinaEmbeddingsV3ForQuestionAnswering (JinaEmbeddingsV3Config model)
- LEDConfig configuration class: LEDForQuestionAnswering (LEDConfig model)
- LayoutLMv2Config configuration class: LayoutLMv2ForQuestionAnswering (LayoutLMv2Config model)
- LayoutLMv3Config configuration class: LayoutLMv3ForQuestionAnswering (LayoutLMv3Config model)
- LiltConfig configuration class: LiltForQuestionAnswering (LiltConfig model)
- LlamaConfig configuration class: LlamaForQuestionAnswering (LlamaConfig model)
- LongformerConfig configuration class: LongformerForQuestionAnswering (LongformerConfig model)
- LukeConfig configuration class: LukeForQuestionAnswering (LukeConfig model)
- LxmertConfig configuration class: LxmertForQuestionAnswering (LxmertConfig model)
- MBartConfig configuration class: MBartForQuestionAnswering (MBartConfig model)
- MPNetConfig configuration class: MPNetForQuestionAnswering (MPNetConfig model)
- MT5Config configuration class: MT5ForQuestionAnswering (MT5Config model)
- MarkupLMConfig configuration class: MarkupLMForQuestionAnswering (MarkupLMConfig model)
- MegatronBertConfig configuration class: MegatronBertForQuestionAnswering (MegatronBertConfig model)
- MiniMaxConfig configuration class: MiniMaxForQuestionAnswering (MiniMaxConfig model)
- Ministral3Config configuration class: Ministral3ForQuestionAnswering (Ministral3Config model)
- MinistralConfig configuration class: MinistralForQuestionAnswering (MinistralConfig model)
- MistralConfig configuration class: MistralForQuestionAnswering (MistralConfig model)
- MixtralConfig configuration class: MixtralForQuestionAnswering (MixtralConfig model)
- MobileBertConfig configuration class: MobileBertForQuestionAnswering (MobileBertConfig model)
- ModernBertConfig configuration class: ModernBertForQuestionAnswering (ModernBertConfig model)
- MptConfig configuration class: MptForQuestionAnswering (MptConfig model)
- MraConfig configuration class: MraForQuestionAnswering (MraConfig model)
- MvpConfig configuration class: MvpForQuestionAnswering (MvpConfig model)
- NemotronConfig configuration class: NemotronForQuestionAnswering (NemotronConfig model)
- NystromformerConfig configuration class: NystromformerForQuestionAnswering (NystromformerConfig model)
- OPTConfig configuration class: OPTForQuestionAnswering (OPTConfig model)
- Qwen2Config configuration class: Qwen2ForQuestionAnswering (Qwen2Config model)
- Qwen2MoeConfig configuration class: Qwen2MoeForQuestionAnswering (Qwen2MoeConfig model)
- Qwen3Config configuration class: Qwen3ForQuestionAnswering (Qwen3Config model)
- Qwen3MoeConfig configuration class: Qwen3MoeForQuestionAnswering (Qwen3MoeConfig model)
- Qwen3NextConfig configuration class: Qwen3NextForQuestionAnswering (Qwen3NextConfig model)
- ReformerConfig configuration class: ReformerForQuestionAnswering (ReformerConfig model)
- RemBertConfig configuration class: RemBertForQuestionAnswering (RemBertConfig model)
- RoCBertConfig configuration class: RoCBertForQuestionAnswering (RoCBertConfig model)
- RoFormerConfig configuration class: RoFormerForQuestionAnswering (RoFormerConfig model)
- RobertaConfig configuration class: RobertaForQuestionAnswering (RobertaConfig model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForQuestionAnswering (RobertaPreLayerNormConfig model)
- SeedOssConfig configuration class: SeedOssForQuestionAnswering (SeedOssConfig model)
- SmolLM3Config configuration class: SmolLM3ForQuestionAnswering (SmolLM3Config model)
- SplinterConfig configuration class: SplinterForQuestionAnswering (SplinterConfig model)
- SqueezeBertConfig configuration class: SqueezeBertForQuestionAnswering (SqueezeBertConfig model)
- T5Config configuration class: T5ForQuestionAnswering (T5Config model)
- UMT5Config configuration class: UMT5ForQuestionAnswering (UMT5Config model)
- XLMConfig configuration class: XLMForQuestionAnsweringSimple (XLMConfig model)
- XLMRobertaConfig configuration class: XLMRobertaForQuestionAnswering (XLMRobertaConfig model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForQuestionAnswering (XLMRobertaXLConfig model)
- XLNetConfig configuration class: XLNetForQuestionAnsweringSimple (XLNetConfig model)
- XmodConfig configuration class: XmodForQuestionAnswering (XmodConfig model)
- YosoConfig configuration class: YosoForQuestionAnswering (YosoConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with a question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
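Question-answering heads of this kind produce start and end logits over the input tokens. A minimal sketch of turning those logits into an answer span (simplified greedy decoding, ignoring batching; extract_span is an invented helper for illustration):

```python
def extract_span(start_logits, end_logits, tokens):
    """Greedy span decoding: best start position, then best end
    position at or after it, so the span is never inverted."""
    start = max(range(len(start_logits)), key=lambda i: start_logits[i])
    end = max(range(start, len(end_logits)), key=lambda i: end_logits[i])
    return " ".join(tokens[start:end + 1])

tokens = ["the", "capital", "is", "paris", "today"]
start_logits = [0.1, 0.2, 0.1, 2.5, 0.3]
end_logits   = [0.0, 0.1, 0.2, 2.0, 0.4]
assert extract_span(start_logits, end_logits, tokens) == "paris"
```

Production decoding typically scores all valid (start, end) pairs up to a maximum answer length instead of this greedy pass, but the head's output format is the same.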
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from a saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should first check whether using save_pretrained() and from_pretrained() would not be a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and instantiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
- If a configuration is provided with
Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForQuestionAnswering (AlbertConfig model)
- arcee — ArceeForQuestionAnswering (ArceeConfig model)
- bart — BartForQuestionAnswering (BartConfig model)
- bert — BertForQuestionAnswering (BertConfig model)
- big_bird — BigBirdForQuestionAnswering (BigBirdConfig model)
- bigbird_pegasus — BigBirdPegasusForQuestionAnswering (BigBirdPegasusConfig model)
- bloom — BloomForQuestionAnswering (BloomConfig model)
- camembert — CamembertForQuestionAnswering (CamembertConfig model)
- canine — CanineForQuestionAnswering (CanineConfig model)
- convbert — ConvBertForQuestionAnswering (ConvBertConfig model)
- data2vec-text — Data2VecTextForQuestionAnswering (Data2VecTextConfig model)
- deberta — DebertaForQuestionAnswering (DebertaConfig model)
- deberta-v2 — DebertaV2ForQuestionAnswering (DebertaV2Config model)
- diffllama — DiffLlamaForQuestionAnswering (DiffLlamaConfig model)
- distilbert — DistilBertForQuestionAnswering (DistilBertConfig model)
- electra — ElectraForQuestionAnswering (ElectraConfig model)
- ernie — ErnieForQuestionAnswering (ErnieConfig model)
- exaone4 — Exaone4ForQuestionAnswering (Exaone4Config model)
- falcon — FalconForQuestionAnswering (FalconConfig model)
- flaubert — FlaubertForQuestionAnsweringSimple (FlaubertConfig model)
- fnet — FNetForQuestionAnswering (FNetConfig model)
- funnel — FunnelForQuestionAnswering (FunnelConfig model)
- gpt2 — GPT2ForQuestionAnswering (GPT2Config model)
- gpt_neo — GPTNeoForQuestionAnswering (GPTNeoConfig model)
- gpt_neox — GPTNeoXForQuestionAnswering (GPTNeoXConfig model)
- gptj — GPTJForQuestionAnswering (GPTJConfig model)
- ibert — IBertForQuestionAnswering (IBertConfig model)
- jina_embeddings_v3 — JinaEmbeddingsV3ForQuestionAnswering (JinaEmbeddingsV3Config model)
- layoutlmv2 — LayoutLMv2ForQuestionAnswering (LayoutLMv2Config model)
- layoutlmv3 — LayoutLMv3ForQuestionAnswering (LayoutLMv3Config model)
- led — LEDForQuestionAnswering (LEDConfig model)
- lilt — LiltForQuestionAnswering (LiltConfig model)
- llama — LlamaForQuestionAnswering (LlamaConfig model)
- longformer — LongformerForQuestionAnswering (LongformerConfig model)
- luke — LukeForQuestionAnswering (LukeConfig model)
- lxmert — LxmertForQuestionAnswering (LxmertConfig model)
- markuplm — MarkupLMForQuestionAnswering (MarkupLMConfig model)
- mbart — MBartForQuestionAnswering (MBartConfig model)
- megatron-bert — MegatronBertForQuestionAnswering (MegatronBertConfig model)
- minimax — MiniMaxForQuestionAnswering (MiniMaxConfig model)
- ministral — MinistralForQuestionAnswering (MinistralConfig model)
- ministral3 — Ministral3ForQuestionAnswering (Ministral3Config model)
- mistral — MistralForQuestionAnswering (MistralConfig model)
- mixtral — MixtralForQuestionAnswering (MixtralConfig model)
- mobilebert — MobileBertForQuestionAnswering (MobileBertConfig model)
- modernbert — ModernBertForQuestionAnswering (ModernBertConfig model)
- mpnet — MPNetForQuestionAnswering (MPNetConfig model)
- mpt — MptForQuestionAnswering (MptConfig model)
- mra — MraForQuestionAnswering (MraConfig model)
- mt5 — MT5ForQuestionAnswering (MT5Config model)
- mvp — MvpForQuestionAnswering (MvpConfig model)
- nemotron — NemotronForQuestionAnswering (NemotronConfig model)
- nystromformer — NystromformerForQuestionAnswering (NystromformerConfig model)
- opt — OPTForQuestionAnswering (OPTConfig model)
- qwen2 — Qwen2ForQuestionAnswering (Qwen2Config model)
- qwen2_moe — Qwen2MoeForQuestionAnswering (Qwen2MoeConfig model)
- qwen3 — Qwen3ForQuestionAnswering (Qwen3Config model)
- qwen3_moe — Qwen3MoeForQuestionAnswering (Qwen3MoeConfig model)
- qwen3_next — Qwen3NextForQuestionAnswering (Qwen3NextConfig model)
- reformer — ReformerForQuestionAnswering (ReformerConfig model)
- rembert — RemBertForQuestionAnswering (RemBertConfig model)
- roberta — RobertaForQuestionAnswering (RobertaConfig model)
- roberta-prelayernorm — RobertaPreLayerNormForQuestionAnswering (RobertaPreLayerNormConfig model)
- roc_bert — RoCBertForQuestionAnswering (RoCBertConfig model)
- roformer — RoFormerForQuestionAnswering (RoFormerConfig model)
- seed_oss — SeedOssForQuestionAnswering (SeedOssConfig model)
- smollm3 — SmolLM3ForQuestionAnswering (SmolLM3Config model)
- splinter — SplinterForQuestionAnswering (SplinterConfig model)
- squeezebert — SqueezeBertForQuestionAnswering (SqueezeBertConfig model)
- t5 — T5ForQuestionAnswering (T5Config model)
- umt5 — UMT5ForQuestionAnswering (UMT5Config model)
- xlm — XLMForQuestionAnsweringSimple (XLMConfig model)
- xlm-roberta — XLMRobertaForQuestionAnswering (XLMRobertaConfig model)
- xlm-roberta-xl — XLMRobertaXLForQuestionAnswering (XLMRobertaXLConfig model)
- xlnet — XLNetForQuestionAnsweringSimple (XLNetConfig model)
- xmod — XmodForQuestionAnswering (XmodConfig model)
- yoso — YosoForQuestionAnswering (YosoConfig model)
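The selection logic above can be pictured as a dictionary lookup on model_type, with substring matching on the model name or path as the fallback. The following is a simplified pure-Python sketch of that mechanism, not the actual transformers implementation, and the mapping lists only a few of the entries above:

```python
# Simplified sketch of auto-class resolution: explicit model_type lookup first,
# then pattern matching on the pretrained model name/path as a fallback.
QA_MAPPING = {
    "bert": "BertForQuestionAnswering",
    "roberta": "RobertaForQuestionAnswering",
    "distilbert": "DistilBertForQuestionAnswering",
}

def resolve_qa_class(name_or_path, model_type=None):
    if model_type is not None:
        # the model_type from the config object takes precedence
        return QA_MAPPING[model_type]
    # fallback: the longest pattern that occurs in the name/path wins,
    # so "distilbert-..." resolves to distilbert rather than bert
    matches = [t for t in QA_MAPPING if t in name_or_path.lower()]
    if not matches:
        raise ValueError(f"Could not infer model type from {name_or_path!r}")
    return QA_MAPPING[max(matches, key=len)]

assert resolve_qa_class("google-bert/bert-base-cased") == "BertForQuestionAnswering"
assert resolve_qa_class("org/distilbert-squad") == "DistilBertForQuestionAnswering"
```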
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
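The eval/train contract described above can be sketched with a toy module. This is a pure-Python mimic of the training-flag behavior, not the real torch.nn.Module API:

```python
# Toy mimic of the train/eval contract: eval() flips a `training` flag
# that layers like dropout consult before applying stochastic behavior.
class TinyModule:
    def __init__(self):
        self.training = True  # modules start in training mode

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def dropout_active(self):
        # dropout only applies while training, as in the note above
        return self.training

m = TinyModule().eval()          # from_pretrained() calls eval() for you
assert m.dropout_active() is False
m.train()                        # switch back before fine-tuning
assert m.dropout_active() is True
```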
Examples:
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForTextEncoding
Computer vision
The following auto classes are available for the following computer vision tasks.
AutoModelForDepthEstimation
This is a generic model class that will be instantiated as one of the model classes of the library (with a depth estimation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- CHMv2Config configuration class: CHMv2ForDepthEstimation (CHMv2Config model)
- DPTConfig configuration class: DPTForDepthEstimation (DPTConfig model)
- DepthAnythingConfig configuration class: DepthAnythingForDepthEstimation (DepthAnythingConfig model)
- DepthProConfig configuration class: DepthProForDepthEstimation (DepthProConfig model)
- GLPNConfig configuration class: GLPNForDepthEstimation (GLPNConfig model)
- PromptDepthAnythingConfig configuration class: PromptDepthAnythingForDepthEstimation (PromptDepthAnythingConfig model)
- ZoeDepthConfig configuration class: ZoeDepthForDepthEstimation (ZoeDepthConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
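The "eager" and "sdpa" implementations mentioned above compute the same function; SDPA is a fused kernel for the manual softmax(QK^T / sqrt(d))V computation. A small sanity-check sketch of that equivalence (assuming PyTorch >= 2.0 is installed, with no attention mask or dropout):

```python
import math
import torch
import torch.nn.functional as F

torch.manual_seed(0)
# (batch, num_heads, seq_len, head_dim)
q = torch.randn(1, 2, 5, 8)
k = torch.randn(1, 2, 5, 8)
v = torch.randn(1, 2, 5, 8)

# "eager": the manual softmax(QK^T / sqrt(d)) V computation
scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
eager_out = torch.softmax(scores, dim=-1) @ v

# "sdpa": the fused PyTorch kernel the parameter description refers to
sdpa_out = F.scaled_dot_product_attention(q, k, v)

assert torch.allclose(eager_out, sdpa_out, atol=1e-5)
```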
Instantiates one of the model classes of the library (with a depth estimation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
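The distinction the note draws between from_config and from_pretrained can be sketched with a toy class. This is a pure-Python mimic of the contract, not the real transformers API:

```python
import random

class TinyModel:
    # Toy mimic: from_config builds the architecture with fresh random
    # weights; from_pretrained additionally restores stored weights.
    def __init__(self, hidden_size):
        self.weight = [random.random() for _ in range(hidden_size)]

    @classmethod
    def from_config(cls, config):
        # only the architecture (here: the size) comes from the config
        return cls(config["hidden_size"])

    @classmethod
    def from_pretrained(cls, checkpoint):
        # build from the stored config, then load the saved weights
        model = cls(checkpoint["config"]["hidden_size"])
        model.weight = checkpoint["weight"]
        return model

cfg = {"hidden_size": 4}
fresh = TinyModel.from_config(cfg)                      # random init, nothing loaded
ckpt = {"config": cfg, "weight": [0.1, 0.2, 0.3, 0.4]}
loaded = TinyModel.from_pretrained(ckpt)                # weights restored
assert len(fresh.weight) == 4 and loaded.weight == ckpt["weight"]
```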
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a depth estimation head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- chmv2 — CHMv2ForDepthEstimation (CHMv2Config model)
- depth_anything — DepthAnythingForDepthEstimation (DepthAnythingConfig model)
- depth_pro — DepthProForDepthEstimation (DepthProConfig model)
- dpt — DPTForDepthEstimation (DPTConfig model)
- glpn — GLPNForDepthEstimation (GLPNConfig model)
- prompt_depth_anything — PromptDepthAnythingForDepthEstimation (PromptDepthAnythingConfig model)
- zoedepth — ZoeDepthForDepthEstimation (ZoeDepthConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForDepthEstimation
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForDepthEstimation.from_pretrained("Intel/dpt-large")
>>> # Update configuration during loading
>>> model = AutoModelForDepthEstimation.from_pretrained("Intel/dpt-large", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForTextRecognition
This is a generic model class that will be instantiated as one of the model classes of the library (with a text recognition head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- PPOCRV5MobileRecConfig configuration class: PPOCRV5MobileRecForTextRecognition (PPOCRV5MobileRecConfig model)
- PPOCRV5ServerRecConfig configuration class: PPOCRV5ServerRecForTextRecognition (PPOCRV5ServerRecConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a text recognition head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a text recognition head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- pp_ocrv5_mobile_rec — PPOCRV5MobileRecForTextRecognition (PPOCRV5MobileRecConfig model)
- pp_ocrv5_server_rec — PPOCRV5ServerRecForTextRecognition (PPOCRV5ServerRecConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForTextRecognition
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTextRecognition.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForTextRecognition.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForTableRecognition
This is a generic model class that will be instantiated as one of the model classes of the library (with a table recognition head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- SLANeXtConfig configuration class: SLANeXtForTableRecognition (SLANeXtConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a table recognition head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a table recognition head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- slanext — SLANeXtForTableRecognition (SLANeXtConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForTableRecognition
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTableRecognition.from_pretrained("google-bert/bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForTableRecognition.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForImageClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- BeitConfig configuration class: BeitForImageClassification (BeitConfig model)
- BitConfig configuration class: BitForImageClassification (BitConfig model)
- CLIPConfig configuration class: CLIPForImageClassification (CLIPConfig model)
- ConvNextConfig configuration class: ConvNextForImageClassification (ConvNextConfig model)
- ConvNextV2Config configuration class: ConvNextV2ForImageClassification (ConvNextV2Config model)
- CvtConfig configuration class: CvtForImageClassification (CvtConfig model)
- Data2VecVisionConfig configuration class: Data2VecVisionForImageClassification (Data2VecVisionConfig model)
- DeiTConfig configuration class: DeiTForImageClassification or DeiTForImageClassificationWithTeacher (DeiTConfig model)
- DinatConfig configuration class: DinatForImageClassification (DinatConfig model)
- Dinov2Config configuration class: Dinov2ForImageClassification (Dinov2Config model)
- Dinov2WithRegistersConfig configuration class: Dinov2WithRegistersForImageClassification (Dinov2WithRegistersConfig model)
- DonutSwinConfig configuration class: DonutSwinForImageClassification (DonutSwinConfig model)
- EfficientNetConfig configuration class: EfficientNetForImageClassification (EfficientNetConfig model)
- FocalNetConfig configuration class: FocalNetForImageClassification (FocalNetConfig model)
- HGNetV2Config configuration class: HGNetV2ForImageClassification (HGNetV2Config model)
- HieraConfig configuration class: HieraForImageClassification (HieraConfig model)
- IJepaConfig configuration class: IJepaForImageClassification (IJepaConfig model)
- ImageGPTConfig configuration class: ImageGPTForImageClassification (ImageGPTConfig model)
- LevitConfig configuration class: LevitForImageClassification or LevitForImageClassificationWithTeacher (LevitConfig model)
- MetaClip2Config configuration class: MetaClip2ForImageClassification (MetaClip2Config model)
- MobileNetV1Config configuration class: MobileNetV1ForImageClassification (MobileNetV1Config model)
- MobileNetV2Config configuration class: MobileNetV2ForImageClassification (MobileNetV2Config model)
- MobileViTConfig configuration class: MobileViTForImageClassification (MobileViTConfig model)
- MobileViTV2Config configuration class: MobileViTV2ForImageClassification (MobileViTV2Config model)
- PPLCNetConfig configuration class: PPLCNetForImageClassification (PPLCNetConfig model)
- PerceiverConfig configuration class: PerceiverForImageClassificationLearned or PerceiverForImageClassificationFourier or PerceiverForImageClassificationConvProcessing (PerceiverConfig model)
- PoolFormerConfig configuration class: PoolFormerForImageClassification (PoolFormerConfig model)
- PvtConfig configuration class: PvtForImageClassification (PvtConfig model)
- PvtV2Config configuration class: PvtV2ForImageClassification (PvtV2Config model)
- RegNetConfig configuration class: RegNetForImageClassification (RegNetConfig model)
- ResNetConfig configuration class: ResNetForImageClassification (ResNetConfig model)
- SegformerConfig configuration class: SegformerForImageClassification (SegformerConfig model)
- ShieldGemma2Config configuration class: ShieldGemma2ForImageClassification (ShieldGemma2Config model)
- Siglip2Config configuration class: Siglip2ForImageClassification (Siglip2Config model)
- SiglipConfig configuration class: SiglipForImageClassification (SiglipConfig model)
- SwiftFormerConfig configuration class: SwiftFormerForImageClassification (SwiftFormerConfig model)
- SwinConfig configuration class: SwinForImageClassification (SwinConfig model)
- Swinv2Config configuration class: Swinv2ForImageClassification (Swinv2Config model)
- TextNetConfig configuration class: TextNetForImageClassification (TextNetConfig model)
- TimmWrapperConfig configuration class: TimmWrapperForImageClassification (TimmWrapperConfig model)
- ViTConfig configuration class: ViTForImageClassification (ViTConfig model)
- ViTMSNConfig configuration class: ViTMSNForImageClassification (ViTMSNConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with an image classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- beit — BeitForImageClassification (BeitConfig model)
- bit — BitForImageClassification (BitConfig model)
- clip — CLIPForImageClassification (CLIPConfig model)
- convnext — ConvNextForImageClassification (ConvNextConfig model)
- convnextv2 — ConvNextV2ForImageClassification (ConvNextV2Config model)
- cvt — CvtForImageClassification (CvtConfig model)
- data2vec-vision — Data2VecVisionForImageClassification (Data2VecVisionConfig model)
- deit — DeiTForImageClassification or DeiTForImageClassificationWithTeacher (DeiTConfig model)
- dinat — DinatForImageClassification (DinatConfig model)
- dinov2 — Dinov2ForImageClassification (Dinov2Config model)
- dinov2_with_registers — Dinov2WithRegistersForImageClassification (Dinov2WithRegistersConfig model)
- donut-swin — DonutSwinForImageClassification (DonutSwinConfig model)
- efficientnet — EfficientNetForImageClassification (EfficientNetConfig model)
- focalnet — FocalNetForImageClassification (FocalNetConfig model)
- hgnet_v2 — HGNetV2ForImageClassification (HGNetV2Config model)
- hiera — HieraForImageClassification (HieraConfig model)
- ijepa — IJepaForImageClassification (IJepaConfig model)
- imagegpt — ImageGPTForImageClassification (ImageGPTConfig model)
- levit — LevitForImageClassification or LevitForImageClassificationWithTeacher (LevitConfig model)
- metaclip_2 — MetaClip2ForImageClassification (MetaClip2Config model)
- mobilenet_v1 — MobileNetV1ForImageClassification (MobileNetV1Config model)
- mobilenet_v2 — MobileNetV2ForImageClassification (MobileNetV2Config model)
- mobilevit — MobileViTForImageClassification (MobileViTConfig model)
- mobilevitv2 — MobileViTV2ForImageClassification (MobileViTV2Config model)
- perceiver — PerceiverForImageClassificationLearned or PerceiverForImageClassificationFourier or PerceiverForImageClassificationConvProcessing (PerceiverConfig model)
- poolformer — PoolFormerForImageClassification (PoolFormerConfig model)
- pp_lcnet — PPLCNetForImageClassification (PPLCNetConfig model)
- pvt — PvtForImageClassification (PvtConfig model)
- pvt_v2 — PvtV2ForImageClassification (PvtV2Config model)
- regnet — RegNetForImageClassification (RegNetConfig model)
- resnet — ResNetForImageClassification (ResNetConfig model)
- segformer — SegformerForImageClassification (SegformerConfig model)
- shieldgemma2 — ShieldGemma2ForImageClassification (ShieldGemma2Config model)
- siglip — SiglipForImageClassification (SiglipConfig model)
- siglip2 — Siglip2ForImageClassification (Siglip2Config model)
- swiftformer — SwiftFormerForImageClassification (SwiftFormerConfig model)
- swin — SwinForImageClassification (SwinConfig model)
- swinv2 — Swinv2ForImageClassification (Swinv2Config model)
- textnet — TextNetForImageClassification (TextNetConfig model)
- timm_wrapper — TimmWrapperForImageClassification (TimmWrapperConfig model)
- vit — ViTForImageClassification (ViTConfig model)
- vit_msn — ViTMSNForImageClassification (ViTMSNConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
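The model_type-based fallback described above can be sketched as a lookup into the checkpoint's config.json. This is a simplified illustration, not the real dispatch code, and the mapping below contains only a few of the entries listed above:

```python
import json

# Simplified sketch: from_pretrained() reads config.json from the checkpoint
# and keys into a task-specific mapping by its "model_type" field.
MODEL_FOR_IMAGE_CLASSIFICATION = {
    "vit": "ViTForImageClassification",
    "swin": "SwinForImageClassification",
    "resnet": "ResNetForImageClassification",
}

config_json = '{"model_type": "vit", "image_size": 224, "num_labels": 1000}'
model_type = json.loads(config_json)["model_type"]
selected = MODEL_FOR_IMAGE_CLASSIFICATION[model_type]
print(selected)  # prints: ViTForImageClassification
```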
Examples:
>>> from transformers import AutoConfig, AutoModelForImageClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")
>>> # Update configuration during loading
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForVideoClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with a video classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- TimesformerConfig configuration class: TimesformerForVideoClassification (TimesformerConfig model)
- VJEPA2Config configuration class: VJEPA2ForVideoClassification (VJEPA2Config model)
- VideoMAEConfig configuration class: VideoMAEForVideoClassification (VideoMAEConfig model)
- VivitConfig configuration class: VivitForVideoClassification (VivitConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, SDPA will be used for torch>=2.1.1 when available; otherwise the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with a video classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a video classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- timesformer — TimesformerForVideoClassification (TimesformerConfig model)
- videomae — VideoMAEForVideoClassification (VideoMAEConfig model)
- vivit — VivitForVideoClassification (VivitConfig model)
- vjepa2 — VJEPA2ForVideoClassification (VJEPA2Config model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
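The eval/train behavior above can be sketched without loading any real weights. This is an illustrative toy class mimicking the training flag, not the actual PreTrainedModel implementation:

```python
class TinyModelSketch:
    """Toy stand-in for a model with a training-mode flag (illustrative only)."""
    def __init__(self):
        self.training = True  # freshly constructed modules start in train mode

    def eval(self):
        self.training = False  # disables dropout-like behavior
        return self

    def train(self):
        self.training = True   # re-enables it for fine-tuning
        return self

# from_pretrained() effectively calls .eval() on the loaded model by default.
m = TinyModelSketch().eval()
assert m.training is False

# Before fine-tuning, switch back to training mode.
m.train()
assert m.training is True
```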
Examples:
>>> from transformers import AutoConfig, AutoModelForVideoClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVideoClassification.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics")
>>> # Update configuration during loading
>>> model = AutoModelForVideoClassification.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForKeypointDetection
AutoModelForKeypointMatching
AutoModelForMaskedImageModeling
This is a generic model class that will be instantiated as one of the model classes of the library (with a masked image modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- DeiTConfig configuration class: DeiTForMaskedImageModeling (DeiTConfig model)
- FocalNetConfig configuration class: FocalNetForMaskedImageModeling (FocalNetConfig model)
- SwinConfig configuration class: SwinForMaskedImageModeling (SwinConfig model)
- Swinv2Config configuration class: Swinv2ForMaskedImageModeling (Swinv2Config model)
- ViTConfig configuration class: ViTForMaskedImageModeling (ViTConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, SDPA will be used for torch>=2.1.1 when available; otherwise the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with a masked image modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a masked image modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- deit — DeiTForMaskedImageModeling (DeiTConfig model)
- focalnet — FocalNetForMaskedImageModeling (FocalNetConfig model)
- swin — SwinForMaskedImageModeling (SwinConfig model)
- swinv2 — Swinv2ForMaskedImageModeling (Swinv2Config model)
- vit — ViTForMaskedImageModeling (ViTConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedImageModeling.from_pretrained("microsoft/swin-base-simmim-window6-192")
>>> # Update configuration during loading
>>> model = AutoModelForMaskedImageModeling.from_pretrained("microsoft/swin-base-simmim-window6-192", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForObjectDetection
This is a generic model class that will be instantiated as one of the model classes of the library (with an object detection head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- ConditionalDetrConfig configuration class: ConditionalDetrForObjectDetection (ConditionalDetrConfig model)
- DFineConfig configuration class: DFineForObjectDetection (DFineConfig model)
- DabDetrConfig configuration class: DabDetrForObjectDetection (DabDetrConfig model)
- DeformableDetrConfig configuration class: DeformableDetrForObjectDetection (DeformableDetrConfig model)
- DetrConfig configuration class: DetrForObjectDetection (DetrConfig model)
- LwDetrConfig configuration class: LwDetrForObjectDetection (LwDetrConfig model)
- PPDocLayoutV2Config configuration class: PPDocLayoutV2ForObjectDetection (PPDocLayoutV2Config model)
- PPDocLayoutV3Config configuration class: PPDocLayoutV3ForObjectDetection (PPDocLayoutV3Config model)
- PPOCRV5MobileDetConfig configuration class: PPOCRV5MobileDetForObjectDetection (PPOCRV5MobileDetConfig model)
- PPOCRV5ServerDetConfig configuration class: PPOCRV5ServerDetForObjectDetection (PPOCRV5ServerDetConfig model)
- RTDetrConfig configuration class: RTDetrForObjectDetection (RTDetrConfig model)
- RTDetrV2Config configuration class: RTDetrV2ForObjectDetection (RTDetrV2Config model)
- TableTransformerConfig configuration class: TableTransformerForObjectDetection (TableTransformerConfig model)
- YolosConfig configuration class: YolosForObjectDetection (YolosConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, SDPA will be used for torch>=2.1.1 when available; otherwise the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with an object detection head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an object detection head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- conditional_detr — ConditionalDetrForObjectDetection (ConditionalDetrConfig model)
- d_fine — DFineForObjectDetection (DFineConfig model)
- dab-detr — DabDetrForObjectDetection (DabDetrConfig model)
- deformable_detr — DeformableDetrForObjectDetection (DeformableDetrConfig model)
- detr — DetrForObjectDetection (DetrConfig model)
- lw_detr — LwDetrForObjectDetection (LwDetrConfig model)
- pp_doclayout_v2 — PPDocLayoutV2ForObjectDetection (PPDocLayoutV2Config model)
- pp_doclayout_v3 — PPDocLayoutV3ForObjectDetection (PPDocLayoutV3Config model)
- pp_ocrv5_mobile_det — PPOCRV5MobileDetForObjectDetection (PPOCRV5MobileDetConfig model)
- pp_ocrv5_server_det — PPOCRV5ServerDetForObjectDetection (PPOCRV5ServerDetConfig model)
- rt_detr — RTDetrForObjectDetection (RTDetrConfig model)
- rt_detr_v2 — RTDetrV2ForObjectDetection (RTDetrV2Config model)
- table-transformer — TableTransformerForObjectDetection (TableTransformerConfig model)
- yolos — YolosForObjectDetection (YolosConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForObjectDetection
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForObjectDetection.from_pretrained("facebook/detr-resnet-50")
>>> # Update configuration during loading
>>> model = AutoModelForObjectDetection.from_pretrained("facebook/detr-resnet-50", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForImageSegmentation
This is a generic model class that will be instantiated as one of the model classes of the library (with an image segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- DetrConfig configuration class: DetrForSegmentation (DetrConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, SDPA will be used for torch>=2.1.1 when available; otherwise the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with an image segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with an image segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- detr — DetrForSegmentation (DetrConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForImageSegmentation
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic")
>>> # Update configuration during loading
>>> model = AutoModelForImageSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForImageToImage
AutoModelForSemanticSegmentation
This is a generic model class that will be instantiated as one of the model classes of the library (with a semantic segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- BeitConfig configuration class: BeitForSemanticSegmentation (BeitConfig model)
- DPTConfig configuration class: DPTForSemanticSegmentation (DPTConfig model)
- Data2VecVisionConfig configuration class: Data2VecVisionForSemanticSegmentation (Data2VecVisionConfig model)
- MobileNetV2Config configuration class: MobileNetV2ForSemanticSegmentation (MobileNetV2Config model)
- MobileViTConfig configuration class: MobileViTForSemanticSegmentation (MobileViTConfig model)
- MobileViTV2Config configuration class: MobileViTV2ForSemanticSegmentation (MobileViTV2Config model)
- SegformerConfig configuration class: SegformerForSemanticSegmentation (SegformerConfig model)
- UperNetConfig configuration class: UperNetForSemanticSegmentation (UperNetConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a semantic segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
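As a sketch of this difference, the snippet below builds a randomly initialized SegformerForSemanticSegmentation from a locally constructed SegformerConfig; nothing is downloaded, and the num_labels value is arbitrary:

```python
from transformers import AutoModelForSemanticSegmentation, SegformerConfig

# Build a configuration locally; nothing is fetched from the Hub.
config = SegformerConfig(num_labels=2)  # num_labels chosen arbitrarily for this sketch

# from_config dispatches on the configuration class:
# SegformerConfig -> SegformerForSemanticSegmentation.
model = AutoModelForSemanticSegmentation.from_config(config)
print(type(model).__name__)  # SegformerForSemanticSegmentation

# The weights are random; call from_pretrained() to load trained weights instead.
```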
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a semantic segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- beit — BeitForSemanticSegmentation (BeitConfig model)
- data2vec-vision — Data2VecVisionForSemanticSegmentation (Data2VecVisionConfig model)
- dpt — DPTForSemanticSegmentation (DPTConfig model)
- mobilenet_v2 — MobileNetV2ForSemanticSegmentation (MobileNetV2Config model)
- mobilevit — MobileViTForSemanticSegmentation (MobileViTConfig model)
- mobilevitv2 — MobileViTV2ForSemanticSegmentation (MobileViTV2Config model)
- segformer — SegformerForSemanticSegmentation (SegformerConfig model)
- upernet — UperNetForSemanticSegmentation (UperNetConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
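A fully offline sketch of the kwargs behavior described above: a small, randomly initialized model saved to a temporary directory stands in for a real checkpoint, and a config attribute is overridden at load time:

```python
import tempfile

from transformers import AutoModelForSemanticSegmentation, SegformerConfig

# Stand-in "checkpoint": a small randomly initialized Segformer saved locally.
model = AutoModelForSemanticSegmentation.from_config(SegformerConfig(num_labels=2))

with tempfile.TemporaryDirectory() as tmp:
    model.save_pretrained(tmp)
    # No `config` argument is passed, so `output_attentions=True` is routed to the
    # configuration class first and overrides the loaded config attribute.
    reloaded = AutoModelForSemanticSegmentation.from_pretrained(tmp, output_attentions=True)

print(reloaded.config.output_attentions)  # True
```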
Examples:
>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
>>> # Update configuration during loading
>>> model = AutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForInstanceSegmentation
This is a generic model class that will be instantiated as one of the model classes of the library (with an instance segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- MaskFormerConfig configuration class: MaskFormerForInstanceSegmentation (MaskFormerConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with an instance segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with an instance segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- maskformer — MaskFormerForInstanceSegmentation (MaskFormerConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-base-coco")
>>> # Update configuration during loading
>>> model = AutoModelForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-base-coco", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForUniversalSegmentation
This is a generic model class that will be instantiated as one of the model classes of the library (with a universal image segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- DetrConfig configuration class: DetrForSegmentation (DetrConfig model)
- EomtConfig configuration class: EomtForUniversalSegmentation (EomtConfig model)
- EomtDinov3Config configuration class: EomtDinov3ForUniversalSegmentation (EomtDinov3Config model)
- Mask2FormerConfig configuration class: Mask2FormerForUniversalSegmentation (Mask2FormerConfig model)
- MaskFormerConfig configuration class: MaskFormerForInstanceSegmentation (MaskFormerConfig model)
- OneFormerConfig configuration class: OneFormerForUniversalSegmentation (OneFormerConfig model)
- VideomtConfig configuration class: VideomtForUniversalSegmentation (VideomtConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a universal image segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a universal image segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- detr — DetrForSegmentation (DetrConfig model)
- eomt — EomtForUniversalSegmentation (EomtConfig model)
- eomt_dinov3 — EomtDinov3ForUniversalSegmentation (EomtDinov3Config model)
- mask2former — Mask2FormerForUniversalSegmentation (Mask2FormerConfig model)
- maskformer — MaskFormerForInstanceSegmentation (MaskFormerConfig model)
- oneformer — OneFormerForUniversalSegmentation (OneFormerConfig model)
- videomt — VideomtForUniversalSegmentation (VideomtConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForUniversalSegmentation
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForUniversalSegmentation.from_pretrained("facebook/mask2former-swin-base-coco-panoptic")
>>> # Update configuration during loading
>>> model = AutoModelForUniversalSegmentation.from_pretrained("facebook/mask2former-swin-base-coco-panoptic", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForZeroShotImageClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot image classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlignConfig configuration class: AlignModel (AlignConfig model)
- AltCLIPConfig configuration class: AltCLIPModel (AltCLIPConfig model)
- Blip2Config configuration class: Blip2ForImageTextRetrieval (Blip2Config model)
- BlipConfig configuration class: BlipModel (BlipConfig model)
- CLIPConfig configuration class: CLIPModel (CLIPConfig model)
- CLIPSegConfig configuration class: CLIPSegModel (CLIPSegConfig model)
- ChineseCLIPConfig configuration class: ChineseCLIPModel (ChineseCLIPConfig model)
- MetaClip2Config configuration class: MetaClip2Model (MetaClip2Config model)
- Siglip2Config configuration class: Siglip2Model (Siglip2Config model)
- SiglipConfig configuration class: SiglipModel (SiglipConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a zero-shot image classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
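For instance, a deliberately tiny CLIP configuration (all sizes below are arbitrary and much smaller than any released checkpoint) resolves to CLIPModel without downloading anything:

```python
from transformers import AutoModelForZeroShotImageClassification, CLIPConfig

# Tiny, arbitrary dimensions so the randomly initialized model is cheap to build.
config = CLIPConfig(
    text_config={"hidden_size": 32, "intermediate_size": 64,
                 "num_hidden_layers": 2, "num_attention_heads": 2},
    vision_config={"hidden_size": 32, "intermediate_size": 64,
                   "num_hidden_layers": 2, "num_attention_heads": 2,
                   "image_size": 32, "patch_size": 8},
    projection_dim=16,
)

# from_config dispatches on the configuration class: CLIPConfig -> CLIPModel.
model = AutoModelForZeroShotImageClassification.from_config(config)
print(type(model).__name__)  # CLIPModel
```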
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
stroros.PathLike) — Can be either:- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__()method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_pathand a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
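The kwargs-splitting behavior described above can be sketched as follows. This is a simplified illustration, not the library's actual code: ToyConfig and split_kwargs are hypothetical stand-ins for the real config class and the internal partitioning logic.

```python
# Simplified sketch (not the library's actual implementation) of how
# from_pretrained splits **kwargs when no explicit config is passed:
# keys matching config attributes update the config, everything else
# is forwarded to the model's __init__.
class ToyConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768

def split_kwargs(config, kwargs):
    config_updates = {k: v for k, v in kwargs.items() if hasattr(config, k)}
    model_kwargs = {k: v for k, v in kwargs.items() if not hasattr(config, k)}
    for key, value in config_updates.items():
        setattr(config, key, value)  # override the config attribute
    return config, model_kwargs

config, extra = split_kwargs(ToyConfig(), {"output_attentions": True, "custom_arg": 1})
print(config.output_attentions)  # True
print(extra)                     # {'custom_arg': 1}
```

This is why passing output_attentions=True with no explicit config updates the loaded configuration, while an unrecognized keyword ends up in the model's __init__.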
Instantiate one of the model classes of the library (with a zero-shot image classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- align — AlignModel (AlignConfig model)
- altclip — AltCLIPModel (AltCLIPConfig model)
- blip — BlipModel (BlipConfig model)
- blip-2 — Blip2ForImageTextRetrieval (Blip2Config model)
- chinese_clip — ChineseCLIPModel (ChineseCLIPConfig model)
- clip — CLIPModel (CLIPConfig model)
- clipseg — CLIPSegModel (CLIPSegConfig model)
- metaclip_2 — MetaClip2Model (MetaClip2Config model)
- siglip — SiglipModel (SiglipConfig model)
- siglip2 — Siglip2Model (Siglip2Config model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForZeroShotImageClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32")
>>> # Update configuration during loading
>>> model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForZeroShotObjectDetection
This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot object detection head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- GroundingDinoConfig configuration class: GroundingDinoForObjectDetection (GroundingDinoConfig model)
- MMGroundingDinoConfig configuration class: MMGroundingDinoForObjectDetection (MMGroundingDinoConfig model)
- OmDetTurboConfig configuration class: OmDetTurboForObjectDetection (OmDetTurboConfig model)
- OwlViTConfig configuration class: OwlViTForObjectDetection (OwlViTConfig model)
- Owlv2Config configuration class: Owlv2ForObjectDetection (Owlv2Config model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
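The "eager" implementation mentioned above is the plain softmax(QKᵀ/√d)V computation; "sdpa" and the flash-attention backends compute the same result in a single fused kernel. A tiny pure-Python sketch of the eager path, for illustration only:

```python
import math

def eager_attention(q, k, v):
    """Manual ("eager") attention: softmax(Q @ K^T / sqrt(d)) @ V.
    q, k, v are lists of row vectors; sdpa/flash attention fuse this
    same math into one kernel instead of materializing the scores."""
    d = len(q[0])
    out = []
    for q_row in q:
        scores = [sum(qi * ki for qi, ki in zip(q_row, k_row)) / math.sqrt(d)
                  for k_row in k]
        m = max(scores)                      # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]      # softmax over key positions
        out.append([sum(w * v_row[j] for w, v_row in zip(weights, v))
                    for j in range(len(v[0]))])
    return out

q = k = v = [[1.0, 0.0], [0.0, 1.0]]
attn = eager_attention(q, k, v)
```

Each output row is a convex combination of the value rows, so its attention weights sum to 1; the fused backends change only how this is computed, not the result.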
Instantiates one of the model classes of the library (with a zero-shot object detection head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
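The configuration-class dispatch performed by from_config can be sketched as a simple registry lookup. This is a toy illustration: the real mapping lives inside the auto classes, and the classes below are empty stand-ins for the library ones.

```python
# Toy illustration of the configuration-class -> model-class dispatch
# performed by from_config (all classes here are stand-ins).
class GroundingDinoConfig: pass
class OwlViTConfig: pass

class GroundingDinoForObjectDetection:
    def __init__(self, config):
        self.config = config

class OwlViTForObjectDetection:
    def __init__(self, config):
        self.config = config

MODEL_MAPPING = {
    GroundingDinoConfig: GroundingDinoForObjectDetection,
    OwlViTConfig: OwlViTForObjectDetection,
}

def from_config(config):
    try:
        model_class = MODEL_MAPPING[type(config)]
    except KeyError:
        raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
    # The model is built from the config alone: weights are randomly
    # initialized here, which is why from_config does not load weights.
    return model_class(config)

model = from_config(OwlViTConfig())
print(type(model).__name__)  # OwlViTForObjectDetection
```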
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
str or os.PathLike) — Can be either: - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__() method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a zero-shot object detection head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- grounding-dino — GroundingDinoForObjectDetection (GroundingDinoConfig model)
- mm-grounding-dino — MMGroundingDinoForObjectDetection (MMGroundingDinoConfig model)
- omdet-turbo — OmDetTurboForObjectDetection (OmDetTurboConfig model)
- owlv2 — Owlv2ForObjectDetection (Owlv2Config model)
- owlvit — OwlViTForObjectDetection (OwlViTConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForZeroShotObjectDetection
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google/owlvit-base-patch32")
>>> # Update configuration during loading
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google/owlvit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True
Audio
The following auto classes are available for the following audio tasks.
AutoModelForAudioClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with an audio classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- ASTConfig configuration class: ASTForAudioClassification (ASTConfig model)
- Data2VecAudioConfig configuration class: Data2VecAudioForSequenceClassification (Data2VecAudioConfig model)
- HubertConfig configuration class: HubertForSequenceClassification (HubertConfig model)
- SEWConfig configuration class: SEWForSequenceClassification (SEWConfig model)
- SEWDConfig configuration class: SEWDForSequenceClassification (SEWDConfig model)
- UniSpeechConfig configuration class: UniSpeechForSequenceClassification (UniSpeechConfig model)
- UniSpeechSatConfig configuration class: UniSpeechSatForSequenceClassification (UniSpeechSatConfig model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertForSequenceClassification (Wav2Vec2BertConfig model)
- Wav2Vec2Config configuration class: Wav2Vec2ForSequenceClassification (Wav2Vec2Config model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForSequenceClassification (Wav2Vec2ConformerConfig model)
- WavLMConfig configuration class: WavLMForSequenceClassification (WavLMConfig model)
- WhisperConfig configuration class: WhisperForAudioClassification (WhisperConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with an audio classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
str or os.PathLike) — Can be either: - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__() method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an audio classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- audio-spectrogram-transformer — ASTForAudioClassification (ASTConfig model)
- data2vec-audio — Data2VecAudioForSequenceClassification (Data2VecAudioConfig model)
- hubert — HubertForSequenceClassification (HubertConfig model)
- sew — SEWForSequenceClassification (SEWConfig model)
- sew-d — SEWDForSequenceClassification (SEWDConfig model)
- unispeech — UniSpeechForSequenceClassification (UniSpeechConfig model)
- unispeech-sat — UniSpeechSatForSequenceClassification (UniSpeechSatConfig model)
- wav2vec2 — Wav2Vec2ForSequenceClassification (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2BertForSequenceClassification (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2ConformerForSequenceClassification (Wav2Vec2ConformerConfig model)
- wavlm — WavLMForSequenceClassification (WavLMConfig model)
- whisper — WhisperForAudioClassification (WhisperConfig model)
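The model_type lookup with its pattern-matching fallback can be sketched like this. It is a simplified illustration: resolve_class is a hypothetical helper, not a library API, and the mapping below is a small excerpt of the real one.

```python
# Sketch of how the class is picked: first by config.model_type, then by
# falling back to pattern matching on the checkpoint name or path.
# (resolve_class is a hypothetical helper, not a transformers API.)
MODEL_TYPE_TO_CLASS = {
    "wav2vec2": "Wav2Vec2ForSequenceClassification",
    "whisper": "WhisperForAudioClassification",
    "hubert": "HubertForSequenceClassification",
}

def resolve_class(model_type, name_or_path):
    if model_type in MODEL_TYPE_TO_CLASS:
        return MODEL_TYPE_TO_CLASS[model_type]
    # Fallback when model_type is missing: the longest model-type key
    # found inside the repo name or path wins.
    matches = [key for key in MODEL_TYPE_TO_CLASS if key in name_or_path]
    if not matches:
        raise ValueError(f"Could not infer model type from {name_or_path!r}")
    return MODEL_TYPE_TO_CLASS[max(matches, key=len)]

print(resolve_class(None, "facebook/wav2vec2-base"))  # Wav2Vec2ForSequenceClassification
print(resolve_class("whisper", "some/local/dir"))     # WhisperForAudioClassification
```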
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForAudioClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioClassification.from_pretrained("facebook/wav2vec2-base")
>>> # Update configuration during loading
>>> model = AutoModelForAudioClassification.from_pretrained("facebook/wav2vec2-base", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForAudioFrameClassification
This is a generic model class that will be instantiated as one of the model classes of the library (with an audio frame (token) classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- Data2VecAudioConfig configuration class: Data2VecAudioForAudioFrameClassification (Data2VecAudioConfig model)
- UniSpeechSatConfig configuration class: UniSpeechSatForAudioFrameClassification (UniSpeechSatConfig model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertForAudioFrameClassification (Wav2Vec2BertConfig model)
- Wav2Vec2Config configuration class: Wav2Vec2ForAudioFrameClassification (Wav2Vec2Config model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForAudioFrameClassification (Wav2Vec2ConformerConfig model)
- WavLMConfig configuration class: WavLMForAudioFrameClassification (WavLMConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with an audio frame (token) classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
str or os.PathLike) — Can be either: - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__() method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an audio frame (token) classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- data2vec-audio — Data2VecAudioForAudioFrameClassification (Data2VecAudioConfig model)
- unispeech-sat — UniSpeechSatForAudioFrameClassification (UniSpeechSatConfig model)
- wav2vec2 — Wav2Vec2ForAudioFrameClassification (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2BertForAudioFrameClassification (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2ConformerForAudioFrameClassification (Wav2Vec2ConformerConfig model)
- wavlm — WavLMForAudioFrameClassification (WavLMConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioFrameClassification.from_pretrained("facebook/wav2vec2-base")
>>> # Update configuration during loading
>>> model = AutoModelForAudioFrameClassification.from_pretrained("facebook/wav2vec2-base", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForCTC
This is a generic model class that will be instantiated as one of the model classes of the library (with a connectionist temporal classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
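A CTC head emits one token prediction per audio frame; decoding then collapses consecutive repeats and removes the blank token. A minimal greedy-decoding sketch (the blank id of 0 and the toy alphabet are assumptions for illustration):

```python
# Minimal greedy CTC decoding sketch: collapse consecutive repeats,
# then drop the blank token (blank id 0 assumed here for illustration).
def ctc_greedy_decode(frame_ids, blank_id=0):
    decoded, previous = [], None
    for token_id in frame_ids:
        if token_id != previous and token_id != blank_id:
            decoded.append(token_id)
        previous = token_id
    return decoded

# Frames spelling "hello": h h _ e e l _ l o  (1=h, 2=e, 3=l, 4=o, 0=blank).
# The blank between the two l's is what preserves the repeated letter.
print(ctc_greedy_decode([1, 1, 0, 2, 2, 3, 0, 3, 4]))  # [1, 2, 3, 3, 4]
```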
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- Data2VecAudioConfig configuration class: Data2VecAudioForCTC (Data2VecAudioConfig model)
- HubertConfig configuration class: HubertForCTC (HubertConfig model)
- LasrCTCConfig configuration class: LasrForCTC (LasrCTCConfig model)
- ParakeetCTCConfig configuration class: ParakeetForCTC (ParakeetCTCConfig model)
- SEWConfig configuration class: SEWForCTC (SEWConfig model)
- SEWDConfig configuration class: SEWDForCTC (SEWDConfig model)
- UniSpeechConfig configuration class: UniSpeechForCTC (UniSpeechConfig model)
- UniSpeechSatConfig configuration class: UniSpeechSatForCTC (UniSpeechSatConfig model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertForCTC (Wav2Vec2BertConfig model)
- Wav2Vec2Config configuration class: Wav2Vec2ForCTC (Wav2Vec2Config model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForCTC (Wav2Vec2ConformerConfig model)
- WavLMConfig configuration class: WavLMForCTC (WavLMConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.
Instantiates one of the model classes of the library (with a connectionist temporal classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
str or os.PathLike) — Can be either: - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__() method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a connectionist temporal classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- data2vec-audio — Data2VecAudioForCTC (Data2VecAudioConfig model)
- hubert — HubertForCTC (HubertConfig model)
- lasr_ctc — LasrForCTC (LasrCTCConfig model)
- parakeet_ctc — ParakeetForCTC (ParakeetCTCConfig model)
- sew — SEWForCTC (SEWConfig model)
- sew-d — SEWDForCTC (SEWDConfig model)
- unispeech — UniSpeechForCTC (UniSpeechConfig model)
- unispeech-sat — UniSpeechSatForCTC (UniSpeechSatConfig model)
- wav2vec2 — Wav2Vec2ForCTC (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2BertForCTC (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2ConformerForCTC (Wav2Vec2ConformerConfig model)
- wavlm — WavLMForCTC (WavLMConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForCTC
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCTC.from_pretrained("facebook/wav2vec2-base-960h")
>>> # Update configuration during loading
>>> model = AutoModelForCTC.from_pretrained("facebook/wav2vec2-base-960h", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForSpeechSeq2Seq
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- CohereAsrConfig configuration class: CohereAsrForConditionalGeneration (CohereAsrConfig model)
- DiaConfig configuration class: DiaForConditionalGeneration (DiaConfig model)
- GraniteSpeechConfig configuration class: GraniteSpeechForConditionalGeneration (GraniteSpeechConfig model)
- KyutaiSpeechToTextConfig configuration class: KyutaiSpeechToTextForConditionalGeneration (KyutaiSpeechToTextConfig model)
- MoonshineConfig configuration class: MoonshineForConditionalGeneration (MoonshineConfig model)
- MoonshineStreamingConfig configuration class: MoonshineStreamingForConditionalGeneration (MoonshineStreamingConfig model)
- Pop2PianoConfig configuration class: Pop2PianoForConditionalGeneration (Pop2PianoConfig model)
- SeamlessM4TConfig configuration class: SeamlessM4TForSpeechToText (SeamlessM4TConfig model)
- SeamlessM4Tv2Config configuration class: SeamlessM4Tv2ForSpeechToText (SeamlessM4Tv2Config model)
- Speech2TextConfig configuration class: Speech2TextForConditionalGeneration (Speech2TextConfig model)
- SpeechEncoderDecoderConfig configuration class: SpeechEncoderDecoderModel (SpeechEncoderDecoderConfig model)
- SpeechT5Config configuration class: SpeechT5ForSpeechToText (SpeechT5Config model)
- VibeVoiceAsrConfig configuration class: VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- VoxtralConfig configuration class: VoxtralForConditionalGeneration (VoxtralConfig model)
- VoxtralRealtimeConfig configuration class: VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
- WhisperConfig configuration class: WhisperForConditionalGeneration (WhisperConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of"eager"(manual implementation of the attention),"sdpa"(usingF.scaled_dot_product_attention),"flash_attention_2"(using Dao-AILab/flash-attention), or"flash_attention_3"(using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual"eager"implementation.
Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
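For instance, the sketch below builds a deliberately tiny WhisperConfig (the sizes are arbitrary, chosen only to keep instantiation fast, and do not match any released checkpoint) and lets the auto class resolve it to WhisperForConditionalGeneration with randomly initialized weights:

```python
from transformers import AutoModelForSpeechSeq2Seq, WhisperConfig

# Hypothetical, shrunken dimensions; not the values of any pretrained model.
config = WhisperConfig(
    d_model=64,
    encoder_layers=2,
    decoder_layers=2,
    encoder_attention_heads=2,
    decoder_attention_heads=2,
    encoder_ffn_dim=128,
    decoder_ffn_dim=128,
)

# The auto class dispatches on the configuration class, not on any weights.
model = AutoModelForSpeechSeq2Seq.from_config(config)
print(type(model).__name__)  # WhisperForConditionalGeneration
```

Because no weights are loaded, the model is only useful as a starting point for training or for testing shapes and code paths.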
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
stroros.PathLike) — Can be either:- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__()method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_pathand a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() would be a simpler option.
- cache_dir (
stroros.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults toFalse) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g.,{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info(
bool, optional, defaults toFalse) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only(
bool, optional, defaults toFalse) — Whether or not to only look at local files (e.g., not try downloading the model). - revision (
str, optional, defaults to"main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevisioncan be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults toFalse) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set toTruefor repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it is loaded) and initialize the model (e.g.,
output_attentions=True). Behaves differently depending on whether aconfigis provided or automatically loaded:- If a configuration is provided with
config,**kwargswill be directly passed to the underlying model’s__init__method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided,
kwargswill be first passed to the configuration class initialization function (from_pretrained()). Each key ofkwargsthat corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargsvalue. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__function.
- If a configuration is provided with
Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- cohere_asr — CohereAsrForConditionalGeneration (CohereAsrConfig model)
- dia — DiaForConditionalGeneration (DiaConfig model)
- granite_speech — GraniteSpeechForConditionalGeneration (GraniteSpeechConfig model)
- kyutai_speech_to_text — KyutaiSpeechToTextForConditionalGeneration (KyutaiSpeechToTextConfig model)
- moonshine — MoonshineForConditionalGeneration (MoonshineConfig model)
- moonshine_streaming — MoonshineStreamingForConditionalGeneration (MoonshineStreamingConfig model)
- pop2piano — Pop2PianoForConditionalGeneration (Pop2PianoConfig model)
- seamless_m4t — SeamlessM4TForSpeechToText (SeamlessM4TConfig model)
- seamless_m4t_v2 — SeamlessM4Tv2ForSpeechToText (SeamlessM4Tv2Config model)
- speech-encoder-decoder — SpeechEncoderDecoderModel (SpeechEncoderDecoderConfig model)
- speech_to_text — Speech2TextForConditionalGeneration (Speech2TextConfig model)
- speecht5 — SpeechT5ForSpeechToText (SpeechT5Config model)
- vibevoice_asr — VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- voxtral — VoxtralForConditionalGeneration (VoxtralConfig model)
- voxtral_realtime — VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
- whisper — WhisperForConditionalGeneration (WhisperConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")
>>> # Update configuration during loading
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForAudioXVector
This is a generic model class that will be instantiated as one of the model classes of the library (with an x-vector head for audio retrieval) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- Data2VecAudioConfig configuration class: Data2VecAudioForXVector (Data2VecAudioConfig model)
- UniSpeechSatConfig configuration class: UniSpeechSatForXVector (UniSpeechSatConfig model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertForXVector (Wav2Vec2BertConfig model)
- Wav2Vec2Config configuration class: Wav2Vec2ForXVector (Wav2Vec2Config model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForXVector (Wav2Vec2ConformerConfig model)
- WavLMConfig configuration class: WavLMForXVector (WavLMConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of"eager"(manual implementation of the attention),"sdpa"(usingF.scaled_dot_product_attention),"flash_attention_2"(using Dao-AILab/flash-attention), or"flash_attention_3"(using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual"eager"implementation.
Instantiates one of the model classes of the library (with an x-vector head for audio retrieval) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
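As a minimal sketch, a WavLMConfig resolves through this auto class to WavLMForXVector; the reduced layer sizes below are illustrative only, and the resulting weights are random:

```python
from transformers import AutoModelForAudioXVector, WavLMConfig

# Hypothetical small transformer stack to keep instantiation cheap.
config = WavLMConfig(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
model = AutoModelForAudioXVector.from_config(config)
print(type(model).__name__)  # WavLMForXVector
```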
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
stroros.PathLike) — Can be either:- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__()method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_pathand a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() would be a simpler option.
- cache_dir (
stroros.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults toFalse) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g.,{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info(
bool, optional, defaults toFalse) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only(
bool, optional, defaults toFalse) — Whether or not to only look at local files (e.g., not try downloading the model). - revision (
str, optional, defaults to"main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevisioncan be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults toFalse) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set toTruefor repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it is loaded) and initialize the model (e.g.,
output_attentions=True). Behaves differently depending on whether aconfigis provided or automatically loaded:- If a configuration is provided with
config,**kwargswill be directly passed to the underlying model’s__init__method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided,
kwargswill be first passed to the configuration class initialization function (from_pretrained()). Each key ofkwargsthat corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargsvalue. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__function.
- If a configuration is provided with
Instantiate one of the model classes of the library (with an x-vector head for audio retrieval) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- data2vec-audio — Data2VecAudioForXVector (Data2VecAudioConfig model)
- unispeech-sat — UniSpeechSatForXVector (UniSpeechSatConfig model)
- wav2vec2 — Wav2Vec2ForXVector (Wav2Vec2Config model)
- wav2vec2-bert — Wav2Vec2BertForXVector (Wav2Vec2BertConfig model)
- wav2vec2-conformer — Wav2Vec2ConformerForXVector (Wav2Vec2ConformerConfig model)
- wavlm — WavLMForXVector (WavLMConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForAudioXVector
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioXVector.from_pretrained("microsoft/wavlm-base-plus-sv")
>>> # Update configuration during loading
>>> model = AutoModelForAudioXVector.from_pretrained("microsoft/wavlm-base-plus-sv", output_attentions=True)
>>> model.config.output_attentions
True
AutoModelForTextToSpectrogram
AutoModelForTextToWaveform
AutoModelForAudioTokenization
This is a generic model class that will be instantiated as one of the model classes of the library (with an audio tokenization head based on codebooks) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- DacConfig configuration class: DacModel (DacConfig model)
- HiggsAudioV2TokenizerConfig configuration class: HiggsAudioV2TokenizerModel (HiggsAudioV2TokenizerConfig model)
- VibeVoiceAcousticTokenizerConfig configuration class: VibeVoiceAcousticTokenizerModel (VibeVoiceAcousticTokenizerConfig model)
- attn_implementation (
str, optional) — The attention implementation to use in the model (if relevant). Can be any of"eager"(manual implementation of the attention),"sdpa"(usingF.scaled_dot_product_attention),"flash_attention_2"(using Dao-AILab/flash-attention), or"flash_attention_3"(using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual"eager"implementation.
Instantiates one of the model classes of the library (with an audio tokenization head based on codebooks) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
stroros.PathLike) — Can be either:- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__()method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_pathand a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() would be a simpler option.
- cache_dir (
stroros.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults toFalse) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g.,{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info(
bool, optional, defaults toFalse) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only(
bool, optional, defaults toFalse) — Whether or not to only look at local files (e.g., not try downloading the model). - revision (
str, optional, defaults to"main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevisioncan be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults toFalse) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set toTruefor repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it is loaded) and initialize the model (e.g.,
output_attentions=True). Behaves differently depending on whether aconfigis provided or automatically loaded:- If a configuration is provided with
config,**kwargswill be directly passed to the underlying model’s__init__method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided,
kwargswill be first passed to the configuration class initialization function (from_pretrained()). Each key ofkwargsthat corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargsvalue. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__function.
- If a configuration is provided with
Instantiate one of the model classes of the library (with an audio tokenization head based on codebooks) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- dac — DacModel (DacConfig model)
- higgs_audio_v2_tokenizer — HiggsAudioV2TokenizerModel (HiggsAudioV2TokenizerConfig model)
- vibevoice_acoustic_tokenizer — VibeVoiceAcousticTokenizerModel (VibeVoiceAcousticTokenizerConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForAudioTokenization
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioTokenization.from_pretrained("descript/dac_16khz")
>>> # Update configuration during loading
>>> model = AutoModelForAudioTokenization.from_pretrained("descript/dac_16khz", output_attentions=True)
>>> model.config.output_attentions
True
Multimodal
The following auto classes are available for the following multimodal tasks.
AutoModelForMultimodalLM
This is a generic model class that will be instantiated as one of the model classes of the library (with a multimodal generation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AriaConfig configuration class: AriaForConditionalGeneration (AriaConfig model)
- AyaVisionConfig configuration class: AyaVisionForConditionalGeneration (AyaVisionConfig model)
- Blip2Config configuration class: Blip2ForConditionalGeneration (Blip2Config model)
- BlipConfig configuration class: BlipForConditionalGeneration (BlipConfig model)
- ChameleonConfig configuration class: ChameleonForConditionalGeneration (ChameleonConfig model)
- Cohere2VisionConfig configuration class: Cohere2VisionForConditionalGeneration (Cohere2VisionConfig model)
- DeepseekVLConfig configuration class: DeepseekVLForConditionalGeneration (DeepseekVLConfig model)
- DeepseekVLHybridConfig configuration class: DeepseekVLHybridForConditionalGeneration (DeepseekVLHybridConfig model)
- Emu3Config configuration class: Emu3ForConditionalGeneration (Emu3Config model)
- Ernie4_5_VLMoeConfig configuration class: Ernie4_5_VLMoeForConditionalGeneration (Ernie4_5_VLMoeConfig model)
- EvollaConfig configuration class: EvollaForProteinText2Text (EvollaConfig model)
- FastVlmConfig configuration class: FastVlmForConditionalGeneration (FastVlmConfig model)
- Florence2Config configuration class: Florence2ForConditionalGeneration (Florence2Config model)
- FuyuConfig configuration class: FuyuForCausalLM (FuyuConfig model)
- Gemma3Config configuration class: Gemma3ForConditionalGeneration (Gemma3Config model)
- Gemma3nConfig configuration class: Gemma3nForConditionalGeneration (Gemma3nConfig model)
- Gemma4Config configuration class: Gemma4ForConditionalGeneration (Gemma4Config model)
- GitConfig configuration class: GitForCausalLM (GitConfig model)
- Glm46VConfig configuration class: Glm46VForConditionalGeneration (Glm46VConfig model)
- Glm4vConfig configuration class: Glm4vForConditionalGeneration (Glm4vConfig model)
- Glm4vMoeConfig configuration class: Glm4vMoeForConditionalGeneration (Glm4vMoeConfig model)
- GlmAsrConfig configuration class: GlmAsrForConditionalGeneration (GlmAsrConfig model)
- GlmOcrConfig configuration class: GlmOcrForConditionalGeneration (GlmOcrConfig model)
- GotOcr2Config configuration class: GotOcr2ForConditionalGeneration (GotOcr2Config model)
- GraniteSpeechConfig configuration class: GraniteSpeechForConditionalGeneration (GraniteSpeechConfig model)
- Idefics2Config configuration class: Idefics2ForConditionalGeneration (Idefics2Config model)
- Idefics3Config configuration class: Idefics3ForConditionalGeneration (Idefics3Config model)
- IdeficsConfig configuration class: IdeficsForVisionText2Text (IdeficsConfig model)
- InstructBlipConfig configuration class: InstructBlipForConditionalGeneration (InstructBlipConfig model)
- InstructBlipVideoConfig configuration class: InstructBlipVideoForConditionalGeneration (InstructBlipVideoConfig model)
- InternVLConfig configuration class: InternVLForConditionalGeneration (InternVLConfig model)
- JanusConfig configuration class: JanusForConditionalGeneration (JanusConfig model)
- Kosmos2Config configuration class: Kosmos2ForConditionalGeneration (Kosmos2Config model)
- Kosmos2_5Config configuration class: Kosmos2_5ForConditionalGeneration (Kosmos2_5Config model)
- KyutaiSpeechToTextConfig configuration class: KyutaiSpeechToTextForConditionalGeneration (KyutaiSpeechToTextConfig model)
- Lfm2VlConfig configuration class: Lfm2VlForConditionalGeneration (Lfm2VlConfig model)
- LightOnOcrConfig configuration class: LightOnOcrForConditionalGeneration (LightOnOcrConfig model)
- Llama4Config configuration class: Llama4ForConditionalGeneration (Llama4Config model)
- LlavaConfig configuration class: LlavaForConditionalGeneration (LlavaConfig model)
- LlavaNextConfig configuration class: LlavaNextForConditionalGeneration (LlavaNextConfig model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoForConditionalGeneration (LlavaNextVideoConfig model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionForConditionalGeneration (LlavaOnevisionConfig model)
- Mistral3Config configuration class: Mistral3ForConditionalGeneration (Mistral3Config model)
- Mistral4Config configuration class: Mistral4ForCausalLM (Mistral4Config model)
- MllamaConfig configuration class: MllamaForConditionalGeneration (MllamaConfig model)
- Ovis2Config configuration class: Ovis2ForConditionalGeneration (Ovis2Config model)
- PI0Config configuration class: PI0ForConditionalGeneration (PI0Config model)
- PPChart2TableConfig configuration class: GotOcr2ForConditionalGeneration (PPChart2TableConfig model)
- PaddleOCRVLConfig configuration class: PaddleOCRVLForConditionalGeneration (PaddleOCRVLConfig model)
- PaliGemmaConfig configuration class: PaliGemmaForConditionalGeneration (PaliGemmaConfig model)
- PerceptionLMConfig configuration class: PerceptionLMForConditionalGeneration (PerceptionLMConfig model)
- Phi4MultimodalConfig configuration class: Phi4MultimodalForCausalLM (Phi4MultimodalConfig model)
- Pix2StructConfig configuration class: Pix2StructForConditionalGeneration (Pix2StructConfig model)
- Qwen2AudioConfig configuration class: Qwen2AudioForConditionalGeneration (Qwen2AudioConfig model)
- Qwen2VLConfig configuration class: Qwen2VLForConditionalGeneration (Qwen2VLConfig model)
- Qwen2_5OmniConfig configuration class: Qwen2_5OmniForConditionalGeneration (Qwen2_5OmniConfig model)
- Qwen2_5_VLConfig configuration class: Qwen2_5_VLForConditionalGeneration (Qwen2_5_VLConfig model)
- Qwen3OmniMoeConfig configuration class: Qwen3OmniMoeForConditionalGeneration (Qwen3OmniMoeConfig model)
- Qwen3VLConfig configuration class: Qwen3VLForConditionalGeneration (Qwen3VLConfig model)
- Qwen3VLMoeConfig configuration class: Qwen3VLMoeForConditionalGeneration (Qwen3VLMoeConfig model)
- Qwen3_5Config configuration class: Qwen3_5ForConditionalGeneration (Qwen3_5Config model)
- Qwen3_5MoeConfig configuration class: Qwen3_5MoeForConditionalGeneration (Qwen3_5MoeConfig model)
- ShieldGemma2Config configuration class: Gemma3ForConditionalGeneration (ShieldGemma2Config model)
- SmolVLMConfig configuration class: SmolVLMForConditionalGeneration (SmolVLMConfig model)
- T5Gemma2Config configuration class: T5Gemma2ForConditionalGeneration (T5Gemma2Config model)
- UdopConfig configuration class: UdopForConditionalGeneration (UdopConfig model)
- VibeVoiceAsrConfig configuration class: VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- VideoLlama3Config configuration class: VideoLlama3ForConditionalGeneration (VideoLlama3Config model)
- VideoLlavaConfig configuration class: VideoLlavaForConditionalGeneration (VideoLlavaConfig model)
- VipLlavaConfig configuration class: VipLlavaForConditionalGeneration (VipLlavaConfig model)
- VisionEncoderDecoderConfig configuration class: VisionEncoderDecoderModel (VisionEncoderDecoderConfig model)
- VoxtralConfig configuration class: VoxtralForConditionalGeneration (VoxtralConfig model)
- VoxtralRealtimeConfig configuration class: VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, SDPA is used if available for torch>=2.1.1; otherwise, the manual "eager" implementation is used.
Instantiates one of the model classes of the library (with a multimodal generation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so code_revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a multimodal generation head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- aria — AriaForConditionalGeneration (AriaConfig model)
- aya_vision — AyaVisionForConditionalGeneration (AyaVisionConfig model)
- blip — BlipForConditionalGeneration (BlipConfig model)
- blip-2 — Blip2ForConditionalGeneration (Blip2Config model)
- chameleon — ChameleonForConditionalGeneration (ChameleonConfig model)
- cohere2_vision — Cohere2VisionForConditionalGeneration (Cohere2VisionConfig model)
- deepseek_vl — DeepseekVLForConditionalGeneration (DeepseekVLConfig model)
- deepseek_vl_hybrid — DeepseekVLHybridForConditionalGeneration (DeepseekVLHybridConfig model)
- emu3 — Emu3ForConditionalGeneration (Emu3Config model)
- ernie4_5_vl_moe — Ernie4_5_VLMoeForConditionalGeneration (Ernie4_5_VLMoeConfig model)
- evolla — EvollaForProteinText2Text (EvollaConfig model)
- fast_vlm — FastVlmForConditionalGeneration (FastVlmConfig model)
- florence2 — Florence2ForConditionalGeneration (Florence2Config model)
- fuyu — FuyuForCausalLM (FuyuConfig model)
- gemma3 — Gemma3ForConditionalGeneration (Gemma3Config model)
- gemma3n — Gemma3nForConditionalGeneration (Gemma3nConfig model)
- gemma4 — Gemma4ForConditionalGeneration (Gemma4Config model)
- git — GitForCausalLM (GitConfig model)
- glm46v — Glm46VForConditionalGeneration (Glm46VConfig model)
- glm4v — Glm4vForConditionalGeneration (Glm4vConfig model)
- glm4v_moe — Glm4vMoeForConditionalGeneration (Glm4vMoeConfig model)
- glm_ocr — GlmOcrForConditionalGeneration (GlmOcrConfig model)
- glmasr — GlmAsrForConditionalGeneration (GlmAsrConfig model)
- got_ocr2 — GotOcr2ForConditionalGeneration (GotOcr2Config model)
- granite_speech — GraniteSpeechForConditionalGeneration (GraniteSpeechConfig model)
- idefics — IdeficsForVisionText2Text (IdeficsConfig model)
- idefics2 — Idefics2ForConditionalGeneration (Idefics2Config model)
- idefics3 — Idefics3ForConditionalGeneration (Idefics3Config model)
- instructblip — InstructBlipForConditionalGeneration (InstructBlipConfig model)
- instructblipvideo — InstructBlipVideoForConditionalGeneration (InstructBlipVideoConfig model)
- internvl — InternVLForConditionalGeneration (InternVLConfig model)
- janus — JanusForConditionalGeneration (JanusConfig model)
- kosmos-2 — Kosmos2ForConditionalGeneration (Kosmos2Config model)
- kosmos-2.5 — Kosmos2_5ForConditionalGeneration (Kosmos2_5Config model)
- kyutai_speech_to_text — KyutaiSpeechToTextForConditionalGeneration (KyutaiSpeechToTextConfig model)
- lfm2_vl — Lfm2VlForConditionalGeneration (Lfm2VlConfig model)
- lighton_ocr — LightOnOcrForConditionalGeneration (LightOnOcrConfig model)
- llama4 — Llama4ForConditionalGeneration (Llama4Config model)
- llava — LlavaForConditionalGeneration (LlavaConfig model)
- llava_next — LlavaNextForConditionalGeneration (LlavaNextConfig model)
- llava_next_video — LlavaNextVideoForConditionalGeneration (LlavaNextVideoConfig model)
- llava_onevision — LlavaOnevisionForConditionalGeneration (LlavaOnevisionConfig model)
- mistral3 — Mistral3ForConditionalGeneration (Mistral3Config model)
- mistral4 — Mistral4ForCausalLM (Mistral4Config model)
- mllama — MllamaForConditionalGeneration (MllamaConfig model)
- ovis2 — Ovis2ForConditionalGeneration (Ovis2Config model)
- paddleocr_vl — PaddleOCRVLForConditionalGeneration (PaddleOCRVLConfig model)
- paligemma — PaliGemmaForConditionalGeneration (PaliGemmaConfig model)
- perception_lm — PerceptionLMForConditionalGeneration (PerceptionLMConfig model)
- phi4_multimodal — Phi4MultimodalForCausalLM (Phi4MultimodalConfig model)
- pi0 — PI0ForConditionalGeneration (PI0Config model)
- pix2struct — Pix2StructForConditionalGeneration (Pix2StructConfig model)
- pp_chart2table — GotOcr2ForConditionalGeneration (PPChart2TableConfig model)
- qwen2_5_omni — Qwen2_5OmniForConditionalGeneration (Qwen2_5OmniConfig model)
- qwen2_5_vl — Qwen2_5_VLForConditionalGeneration (Qwen2_5_VLConfig model)
- qwen2_audio — Qwen2AudioForConditionalGeneration (Qwen2AudioConfig model)
- qwen2_vl — Qwen2VLForConditionalGeneration (Qwen2VLConfig model)
- qwen3_5 — Qwen3_5ForConditionalGeneration (Qwen3_5Config model)
- qwen3_5_moe — Qwen3_5MoeForConditionalGeneration (Qwen3_5MoeConfig model)
- qwen3_omni_moe — Qwen3OmniMoeForConditionalGeneration (Qwen3OmniMoeConfig model)
- qwen3_vl — Qwen3VLForConditionalGeneration (Qwen3VLConfig model)
- qwen3_vl_moe — Qwen3VLMoeForConditionalGeneration (Qwen3VLMoeConfig model)
- shieldgemma2 — Gemma3ForConditionalGeneration (ShieldGemma2Config model)
- smolvlm — SmolVLMForConditionalGeneration (SmolVLMConfig model)
- t5gemma2 — T5Gemma2ForConditionalGeneration (T5Gemma2Config model)
- udop — UdopForConditionalGeneration (UdopConfig model)
- vibevoice_asr — VibeVoiceAsrForConditionalGeneration (VibeVoiceAsrConfig model)
- video_llama_3 — VideoLlama3ForConditionalGeneration (VideoLlama3Config model)
- video_llava — VideoLlavaForConditionalGeneration (VideoLlavaConfig model)
- vipllava — VipLlavaForConditionalGeneration (VipLlavaConfig model)
- vision-encoder-decoder — VisionEncoderDecoderModel (VisionEncoderDecoderConfig model)
- voxtral — VoxtralForConditionalGeneration (VoxtralConfig model)
- voxtral_realtime — VoxtralRealtimeForConditionalGeneration (VoxtralRealtimeConfig model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForMultimodalLM
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMultimodalLM.from_pretrained("llava-hf/llava-1.5-7b-hf")
>>> # Update configuration during loading
>>> model = AutoModelForMultimodalLM.from_pretrained("llava-hf/llava-1.5-7b-hf", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForTableQuestionAnswering
This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- TapasConfig configuration class: TapasForQuestionAnswering (TapasConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, SDPA is used if available for torch>=2.1.1; otherwise, the manual "eager" implementation is used.
Instantiates one of the model classes of the library (with a table question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so code_revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a table question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- tapas — TapasForQuestionAnswering (TapasConfig model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")
>>> # Update configuration during loading
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForDocumentQuestionAnswering
This is a generic model class that will be instantiated as one of the model classes of the library (with a document question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- LayoutLMConfig configuration class: LayoutLMForQuestionAnswering (LayoutLMConfig model)
- LayoutLMv2Config configuration class: LayoutLMv2ForQuestionAnswering (LayoutLMv2Config model)
- LayoutLMv3Config configuration class: LayoutLMv3ForQuestionAnswering (LayoutLMv3Config model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, SDPA is used if available for torch>=2.1.1; otherwise, the manual "eager" implementation is used.
Instantiates one of the model classes of the library (with a document question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
Examples:
>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering
>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = AutoModelForDocumentQuestionAnswering.from_config(config)

from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so code_revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a document question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- layoutlm — LayoutLMForQuestionAnswering (LayoutLMConfig model)
- layoutlmv2 — LayoutLMv2ForQuestionAnswering (LayoutLMv2Config model)
- layoutlmv3 — LayoutLMv3ForQuestionAnswering (LayoutLMv3Config model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> # Update configuration during loading
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForVisualQuestionAnswering
This is a generic model class that will be instantiated as one of the model classes of the library (with a visual question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- Blip2Config configuration class: Blip2ForConditionalGeneration (Blip2Config model)
- BlipConfig configuration class: BlipForQuestionAnswering (BlipConfig model)
- ViltConfig configuration class: ViltForQuestionAnswering (ViltConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, SDPA is used if available for torch>=2.1.1; otherwise, the manual "eager" implementation is used.
Instantiates one of the model classes of the library (with a visual question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args, **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
- config (PreTrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so code_revision can be any identifier allowed by git.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a visual question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- blip — BlipForQuestionAnswering (BlipConfig model)
- blip-2 — Blip2ForConditionalGeneration (Blip2Config model)
- vilt — ViltForQuestionAnswering (ViltConfig model)
The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
>>> # Update configuration during loading
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa", output_attentions=True)
>>> model.config.output_attentions
True

AutoModelForImageTextToText
This is a generic model class that will be instantiated as one of the model classes of the library (with an image-text-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AriaConfig configuration class: AriaForConditionalGeneration (AriaConfig model)
- AyaVisionConfig configuration class: AyaVisionForConditionalGeneration (AyaVisionConfig model)
- Blip2Config configuration class: Blip2ForConditionalGeneration (Blip2Config model)
- BlipConfig configuration class: BlipForConditionalGeneration (BlipConfig model)
- ChameleonConfig configuration class: ChameleonForConditionalGeneration (ChameleonConfig model)
- Cohere2VisionConfig configuration class: Cohere2VisionForConditionalGeneration (Cohere2VisionConfig model)
- DeepseekVLConfig configuration class: DeepseekVLForConditionalGeneration (DeepseekVLConfig model)
- DeepseekVLHybridConfig configuration class: DeepseekVLHybridForConditionalGeneration (DeepseekVLHybridConfig model)
- Emu3Config configuration class: Emu3ForConditionalGeneration (Emu3Config model)
- Ernie4_5_VLMoeConfig configuration class: Ernie4_5_VLMoeForConditionalGeneration (Ernie4_5_VLMoeConfig model)
- EvollaConfig configuration class: EvollaForProteinText2Text (EvollaConfig model)
- FastVlmConfig configuration class: FastVlmForConditionalGeneration (FastVlmConfig model)
- Florence2Config configuration class: Florence2ForConditionalGeneration (Florence2Config model)
- FuyuConfig configuration class: FuyuForCausalLM (FuyuConfig model)
- Gemma3Config configuration class: Gemma3ForConditionalGeneration (Gemma3Config model)
- Gemma3nConfig configuration class: Gemma3nForConditionalGeneration (Gemma3nConfig model)
- Gemma4Config configuration class: Gemma4ForConditionalGeneration (Gemma4Config model)
- GitConfig configuration class: GitForCausalLM (GitConfig model)
- Glm46VConfig configuration class: Glm46VForConditionalGeneration (Glm46VConfig model)
- Glm4vConfig configuration class: Glm4vForConditionalGeneration (Glm4vConfig model)
- Glm4vMoeConfig configuration class: Glm4vMoeForConditionalGeneration (Glm4vMoeConfig model)
- GlmOcrConfig configuration class: GlmOcrForConditionalGeneration (GlmOcrConfig model)
- GotOcr2Config configuration class: GotOcr2ForConditionalGeneration (GotOcr2Config model)
- Idefics2Config configuration class: Idefics2ForConditionalGeneration (Idefics2Config model)
- Idefics3Config configuration class: Idefics3ForConditionalGeneration (Idefics3Config model)
- IdeficsConfig configuration class: IdeficsForVisionText2Text (IdeficsConfig model)
- InstructBlipConfig configuration class: InstructBlipForConditionalGeneration (InstructBlipConfig model)
- InstructBlipVideoConfig configuration class: InstructBlipVideoForConditionalGeneration (InstructBlipVideoConfig model)
- InternVLConfig configuration class: InternVLForConditionalGeneration (InternVLConfig model)
- JanusConfig configuration class: JanusForConditionalGeneration (JanusConfig model)
- Kosmos2Config configuration class: Kosmos2ForConditionalGeneration (Kosmos2Config model)
- Kosmos2_5Config configuration class: Kosmos2_5ForConditionalGeneration (Kosmos2_5Config model)
- Lfm2VlConfig configuration class: Lfm2VlForConditionalGeneration (Lfm2VlConfig model)
- LightOnOcrConfig configuration class: LightOnOcrForConditionalGeneration (LightOnOcrConfig model)
- Llama4Config configuration class: Llama4ForConditionalGeneration (Llama4Config model)
- LlavaConfig configuration class: LlavaForConditionalGeneration (LlavaConfig model)
- LlavaNextConfig configuration class: LlavaNextForConditionalGeneration (LlavaNextConfig model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoForConditionalGeneration (LlavaNextVideoConfig model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionForConditionalGeneration (LlavaOnevisionConfig model)
- Mistral3Config configuration class: Mistral3ForConditionalGeneration (Mistral3Config model)
- Mistral4Config configuration class: Mistral4ForCausalLM (Mistral4Config model)
- MllamaConfig configuration class: MllamaForConditionalGeneration (MllamaConfig model)
- Ovis2Config configuration class: Ovis2ForConditionalGeneration (Ovis2Config model)
- PI0Config configuration class: PI0ForConditionalGeneration (PI0Config model)
- PPChart2TableConfig configuration class: GotOcr2ForConditionalGeneration (PPChart2TableConfig model)
- PaddleOCRVLConfig configuration class: PaddleOCRVLForConditionalGeneration (PaddleOCRVLConfig model)
- PaliGemmaConfig configuration class: PaliGemmaForConditionalGeneration (PaliGemmaConfig model)
- PerceptionLMConfig configuration class: PerceptionLMForConditionalGeneration (PerceptionLMConfig model)
- Pix2StructConfig configuration class: Pix2StructForConditionalGeneration (Pix2StructConfig model)
- Qwen2VLConfig configuration class: Qwen2VLForConditionalGeneration (Qwen2VLConfig model)
- Qwen2_5_VLConfig configuration class: Qwen2_5_VLForConditionalGeneration (Qwen2_5_VLConfig model)
- Qwen3VLConfig configuration class: Qwen3VLForConditionalGeneration (Qwen3VLConfig model)
- Qwen3VLMoeConfig configuration class: Qwen3VLMoeForConditionalGeneration (Qwen3VLMoeConfig model)
- Qwen3_5Config configuration class: Qwen3_5ForConditionalGeneration (Qwen3_5Config model)
- Qwen3_5MoeConfig configuration class: Qwen3_5MoeForConditionalGeneration (Qwen3_5MoeConfig model)
- ShieldGemma2Config configuration class: Gemma3ForConditionalGeneration (ShieldGemma2Config model)
- SmolVLMConfig configuration class: SmolVLMForConditionalGeneration (SmolVLMConfig model)
- T5Gemma2Config configuration class: T5Gemma2ForConditionalGeneration (T5Gemma2Config model)
- UdopConfig configuration class: UdopForConditionalGeneration (UdopConfig model)
- VideoLlama3Config configuration class: VideoLlama3ForConditionalGeneration (VideoLlama3Config model)
- VideoLlavaConfig configuration class: VideoLlavaForConditionalGeneration (VideoLlavaConfig model)
- VipLlavaConfig configuration class: VipLlavaForConditionalGeneration (VipLlavaConfig model)
- VisionEncoderDecoderConfig configuration class: VisionEncoderDecoderModel (VisionEncoderDecoderConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with an image-text-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
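The configuration-class dispatch listed above boils down to a mapping lookup from config class to model class. A minimal sketch (simplified toy mappings; the real implementation uses transformers' lazy auto mappings):

```python
# Simplified sketch of config-class -> model-class dispatch (not the real
# transformers implementation, which uses lazy mappings).
class NewModelConfig:
    model_type = "new-model"

class NewModel:
    def __init__(self, config):
        self.config = config

CONFIG_MAPPING = {}   # model_type string -> config class
MODEL_MAPPING = {}    # config class -> model class

def register(model_type, config_cls, model_cls):
    CONFIG_MAPPING[model_type] = config_cls
    MODEL_MAPPING[config_cls] = model_cls

register("new-model", NewModelConfig, NewModel)

def from_config(config):
    # Select the model class from the configuration's class, as from_config does.
    return MODEL_MAPPING[type(config)](config)

model = from_config(NewModelConfig())
# model is a NewModel instance built from the config, with no weights loaded
```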
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
stroros.PathLike) — Can be either:- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__()method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_pathand a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
stroros.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults toFalse) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g.,{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info(
bool, optional, defaults toFalse) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only(
bool, optional, defaults toFalse) — Whether or not to only look at local files (e.g., not try downloading the model). - revision (
str, optional, defaults to"main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevisioncan be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults toFalse) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set toTruefor repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to"main") — The specific revision to use for the code on the Hub, if the code leaves in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevisioncan be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it being loaded) and initiate the model (e.g.,
output_attentions=True). Behaves differently depending on whether aconfigis provided or automatically loaded:- If a configuration is provided with
config,**kwargswill be directly passed to the underlying model’s__init__method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided,
kwargswill be first passed to the configuration class initialization function (from_pretrained()). Each key ofkwargsthat corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargsvalue. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__function.
- If a configuration is provided with
Instantiate one of the model classes of the library (with an image-text-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- aria — AriaForConditionalGeneration (AriaConfig model)
- aya_vision — AyaVisionForConditionalGeneration (AyaVisionConfig model)
- blip — BlipForConditionalGeneration (BlipConfig model)
- blip-2 — Blip2ForConditionalGeneration (Blip2Config model)
- chameleon — ChameleonForConditionalGeneration (ChameleonConfig model)
- cohere2_vision — Cohere2VisionForConditionalGeneration (Cohere2VisionConfig model)
- deepseek_vl — DeepseekVLForConditionalGeneration (DeepseekVLConfig model)
- deepseek_vl_hybrid — DeepseekVLHybridForConditionalGeneration (DeepseekVLHybridConfig model)
- emu3 — Emu3ForConditionalGeneration (Emu3Config model)
- ernie4_5_vl_moe — Ernie4_5_VLMoeForConditionalGeneration (Ernie4_5_VLMoeConfig model)
- evolla — EvollaForProteinText2Text (EvollaConfig model)
- fast_vlm — FastVlmForConditionalGeneration (FastVlmConfig model)
- florence2 — Florence2ForConditionalGeneration (Florence2Config model)
- fuyu — FuyuForCausalLM (FuyuConfig model)
- gemma3 — Gemma3ForConditionalGeneration (Gemma3Config model)
- gemma3n — Gemma3nForConditionalGeneration (Gemma3nConfig model)
- gemma4 — Gemma4ForConditionalGeneration (Gemma4Config model)
- git — GitForCausalLM (GitConfig model)
- glm46v — Glm46VForConditionalGeneration (Glm46VConfig model)
- glm4v — Glm4vForConditionalGeneration (Glm4vConfig model)
- glm4v_moe — Glm4vMoeForConditionalGeneration (Glm4vMoeConfig model)
- glm_ocr — GlmOcrForConditionalGeneration (GlmOcrConfig model)
- got_ocr2 — GotOcr2ForConditionalGeneration (GotOcr2Config model)
- idefics — IdeficsForVisionText2Text (IdeficsConfig model)
- idefics2 — Idefics2ForConditionalGeneration (Idefics2Config model)
- idefics3 — Idefics3ForConditionalGeneration (Idefics3Config model)
- instructblip — InstructBlipForConditionalGeneration (InstructBlipConfig model)
- instructblipvideo — InstructBlipVideoForConditionalGeneration (InstructBlipVideoConfig model)
- internvl — InternVLForConditionalGeneration (InternVLConfig model)
- janus — JanusForConditionalGeneration (JanusConfig model)
- kosmos-2 — Kosmos2ForConditionalGeneration (Kosmos2Config model)
- kosmos-2.5 — Kosmos2_5ForConditionalGeneration (Kosmos2_5Config model)
- lfm2_vl — Lfm2VlForConditionalGeneration (Lfm2VlConfig model)
- lighton_ocr — LightOnOcrForConditionalGeneration (LightOnOcrConfig model)
- llama4 — Llama4ForConditionalGeneration (Llama4Config model)
- llava — LlavaForConditionalGeneration (LlavaConfig model)
- llava_next — LlavaNextForConditionalGeneration (LlavaNextConfig model)
- llava_next_video — LlavaNextVideoForConditionalGeneration (LlavaNextVideoConfig model)
- llava_onevision — LlavaOnevisionForConditionalGeneration (LlavaOnevisionConfig model)
- mistral3 — Mistral3ForConditionalGeneration (Mistral3Config model)
- mistral4 — Mistral4ForCausalLM (Mistral4Config model)
- mllama — MllamaForConditionalGeneration (MllamaConfig model)
- ovis2 — Ovis2ForConditionalGeneration (Ovis2Config model)
- paddleocr_vl — PaddleOCRVLForConditionalGeneration (PaddleOCRVLConfig model)
- paligemma — PaliGemmaForConditionalGeneration (PaliGemmaConfig model)
- perception_lm — PerceptionLMForConditionalGeneration (PerceptionLMConfig model)
- pi0 — PI0ForConditionalGeneration (PI0Config model)
- pix2struct — Pix2StructForConditionalGeneration (Pix2StructConfig model)
- pp_chart2table — GotOcr2ForConditionalGeneration (PPChart2TableConfig model)
- qwen2_5_vl — Qwen2_5_VLForConditionalGeneration (Qwen2_5_VLConfig model)
- qwen2_vl — Qwen2VLForConditionalGeneration (Qwen2VLConfig model)
- qwen3_5 — Qwen3_5ForConditionalGeneration (Qwen3_5Config model)
- qwen3_5_moe — Qwen3_5MoeForConditionalGeneration (Qwen3_5MoeConfig model)
- qwen3_vl — Qwen3VLForConditionalGeneration (Qwen3VLConfig model)
- qwen3_vl_moe — Qwen3VLMoeForConditionalGeneration (Qwen3VLMoeConfig model)
- shieldgemma2 — Gemma3ForConditionalGeneration (ShieldGemma2Config model)
- smolvlm — SmolVLMForConditionalGeneration (SmolVLMConfig model)
- t5gemma2 — T5Gemma2ForConditionalGeneration (T5Gemma2Config model)
- udop — UdopForConditionalGeneration (UdopConfig model)
- video_llama_3 — VideoLlama3ForConditionalGeneration (VideoLlama3Config model)
- video_llava — VideoLlavaForConditionalGeneration (VideoLlavaConfig model)
- vipllava — VipLlavaForConditionalGeneration (VipLlavaConfig model)
- vision-encoder-decoder — VisionEncoderDecoderModel (VisionEncoderDecoderConfig model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
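When the config carries no model_type, the name-based fallback amounts to matching the keys listed above against pretrained_model_name_or_path. A rough sketch over a small subset of the keys (illustrative only; transformers checks its full mapping):

```python
# Illustrative fallback: match known model_type keys against the name/path.
# A small subset of the keys above, for demonstration.
KNOWN_TYPES = ["llava_next_video", "llava_next", "llava", "blip-2", "blip"]

def guess_model_type(name_or_path):
    # Check longer keys first so "llava_next" is not mistaken for "llava".
    for key in sorted(KNOWN_TYPES, key=len, reverse=True):
        if key in name_or_path:
            return key
    return None

guess_model_type("llava-hf/llava-1.5-7b-hf")    # returns "llava"
guess_model_type("my-org/llava_next-finetune")  # returns "llava_next"
```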
Examples:
>>> from transformers import AutoConfig, AutoModelForImageTextToText
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageTextToText.from_pretrained("llava-hf/llava-1.5-7b-hf")
>>> # Update configuration during loading
>>> model = AutoModelForImageTextToText.from_pretrained("llava-hf/llava-1.5-7b-hf", output_attentions=True)
>>> model.config.output_attentions
True

Time Series
AutoModelForTimeSeriesPrediction
This is a generic model class that will be instantiated as one of the model classes of the library (with a time-series prediction head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config
< source >( **kwargs )
Parameters
- config (PreTrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- TimesFm2_5Config configuration class: TimesFm2_5ModelForPrediction (TimesFm2_5Config model)
- TimesFmConfig configuration class: TimesFmModelForPrediction (TimesFmConfig model)
- attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), "flash_attention_2" (using Dao-AILab/flash-attention), or "flash_attention_3" (using Dao-AILab/flash-attention/hopper). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.
Instantiates one of the model classes of the library (with a time-series prediction head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained
< source >( *model_args **kwargs )
Parameters
- pretrained_model_name_or_path (
stroros.PathLike) — Can be either:- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using
save_pretrained(), e.g.,
./my_model_directory/.
- model_args (additional positional arguments, optional) —
Will be passed along to the underlying model
__init__()method. - config (PreTrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as
pretrained_model_name_or_pathand a configuration JSON file named config.json is found in the directory.
- state_dict (dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (
stroros.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. - force_download (
bool, optional, defaults toFalse) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist. - proxies (
dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g.,{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request. - output_loading_info(
bool, optional, defaults toFalse) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages. - local_files_only(
bool, optional, defaults toFalse) — Whether or not to only look at local files (e.g., not try downloading the model). - revision (
str, optional, defaults to"main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevisioncan be any identifier allowed by git. - trust_remote_code (
bool, optional, defaults toFalse) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set toTruefor repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. - code_revision (
str, optional, defaults to"main") — The specific revision to use for the code on the Hub, if the code leaves in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, sorevisioncan be any identifier allowed by git. - kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it being loaded) and initiate the model (e.g.,
output_attentions=True). Behaves differently depending on whether aconfigis provided or automatically loaded:- If a configuration is provided with
config,**kwargswill be directly passed to the underlying model’s__init__method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided,
kwargswill be first passed to the configuration class initialization function (from_pretrained()). Each key ofkwargsthat corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargsvalue. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__function.
- If a configuration is provided with
Instantiate one of the model classes of the library (with a time-series prediction head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path:
- timesfm — TimesFmModelForPrediction (TimesFmConfig model)
- timesfm2_5 — TimesFm2_5ModelForPrediction (TimesFm2_5Config model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
Examples:
>>> from transformers import AutoConfig, AutoModelForTimeSeriesPrediction
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTimeSeriesPrediction.from_pretrained("google/timesfm-2.0-500m-pytorch")
>>> # Update configuration during loading
>>> model = AutoModelForTimeSeriesPrediction.from_pretrained("google/timesfm-2.0-500m-pytorch", output_attentions=True)
>>> model.config.output_attentions
True