---
license: apache-2.0
datasets:
- jtatman/python-code-dataset-500k
metrics:
- bleu
- rouge
- perplexity
- chrf
- codebertscore
base_model:
- codellama/CodeLlama-7b-Python-hf
pipeline_tag: text-generation
tags:
- code
- python
- codellama
- lora
- peft
- sft
- programming
---

# CodeLlama-7b-Python-hf-ft

This repository contains a **LoRA fine-tuned adapter** for **[CodeLlama-7b-Python-hf](https://huggingface.co/codellama/CodeLlama-7b-Python-hf)**, trained to improve **Python instruction-following and code generation**.

**Note:** This is a **PEFT LoRA adapter**, not a fully merged standalone model. You must load it on top of the base model.

---

## Model Details

- **Base model**: [codellama/CodeLlama-7b-Python-hf](https://huggingface.co/codellama/CodeLlama-7b-Python-hf)
- **Fine-tuned for**: Python instruction-following and code generation
- **Fine-tuning method**: SFT + LoRA (PEFT)
- **Framework**: Transformers + PEFT + TRL

---

## Dataset Used

This adapter was fine-tuned on:

1. [jtatman/python-code-dataset-500k](https://huggingface.co/datasets/jtatman/python-code-dataset-500k)
   - Large-scale Python instruction → solution pairs
   - Parquet format (approximately 500k examples)

---

## Training Configuration

### LoRA Configuration

- **r:** 32
- **lora_alpha:** 16
- **Target modules:**
  `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`

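
To make these settings concrete, the sketch below estimates the LoRA scaling factor and the number of trainable adapter parameters. The layer dimensions (`HIDDEN = 4096`, `INTERMEDIATE = 11008`, 32 decoder layers) are assumptions based on the standard Llama 7B architecture and are not stated in this card. The sketch also assumes the default scaling `lora_alpha / r` and no extra trainable modules or biases.

```python
# Rough estimate of LoRA adapter size for the configuration listed above.
# ASSUMPTION: dimensions follow the standard Llama 7B layout (not stated in this card).
HIDDEN = 4096
INTERMEDIATE = 11008
NUM_LAYERS = 32

R = 32
LORA_ALPHA = 16

# (in_features, out_features) of each targeted linear layer in one decoder block
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, r):
    """LoRA adds two matrices per wrapped layer: A (r x in) and B (out x r)."""
    return r * (in_features + out_features)


# Default LoRA scaling applied to the low-rank update (assumes rsLoRA is off)
scaling = LORA_ALPHA / R

per_layer = sum(lora_params(i, o, R) for i, o in TARGET_SHAPES.values())
total = per_layer * NUM_LAYERS

print(f"scaling factor: {scaling}")
print(f"adapter parameters per decoder layer: {per_layer:,}")
print(f"adapter parameters in total: {total:,}")
```

Under these assumptions the low-rank update is scaled by 0.5 and the adapter holds roughly 80M trainable parameters, on the order of 1% of the 7B base model.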
### SFT Configuration

- **Epochs:** 1
- **Learning rate:** 2e-4
- **Scheduler:** cosine
- **Warmup ratio:** 0.03
- **Weight decay:** 0.0
- **Train batch size:** 4
- **Eval batch size:** 4
- **Gradient accumulation steps:** 16
- **Precision:** bf16
- **Attention:** flash_attention_2
- **Packing:** enabled
- **Gradient checkpointing:** enabled
- **Logging:** every 50 steps + per epoch
- **Saving:** per epoch (`save_total_limit=2`)

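
As a rough illustration of how these values interact, the sketch below computes the effective batch size per optimizer step and the shape of a cosine schedule with linear warmup. The total step count (`TOTAL_STEPS`) and the number of devices are hypothetical, since the card does not report them. The schedule is a plain-Python approximation of linear warmup followed by cosine decay to zero, not the trainer's own code. With packing enabled, each batch element is a packed sequence rather than a single example.

```python
import math

LEARNING_RATE = 2e-4
WARMUP_RATIO = 0.03
PER_DEVICE_BATCH = 4
GRAD_ACCUM_STEPS = 16


def effective_batch_size(num_devices=1):
    """Sequences consumed per optimizer step (num_devices is an assumption)."""
    return PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * num_devices


def lr_at(step, total_steps):
    """Linear warmup to the peak rate, then cosine decay towards zero."""
    warmup_steps = math.ceil(WARMUP_RATIO * total_steps)
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # hypothetical; the real number of optimizer steps is not reported

print("effective batch size (1 device):", effective_batch_size())
for step in (0, 15, 30, 500, 1000):
    print(f"step {step:>4}: lr = {lr_at(step, TOTAL_STEPS):.2e}")
```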

---

## Evaluation Results

The model was evaluated using both language-modeling metrics and generation-quality metrics.

### 📉 Perplexity / Loss

- **Base model loss:** `1.3214`
- **Base model perplexity:** `3.7486`
- **Fine-tuned (LoRA) val/test loss:** `0.7126`
- **Fine-tuned (LoRA) val/test perplexity:** `2.0394`

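
The loss and perplexity figures above are related by perplexity = exp(loss), assuming the loss is the mean per-token cross-entropy in nats. The short check below recomputes perplexity from the reported losses. Small differences in the last digit are expected because the losses themselves are rounded.

```python
import math

# Losses as reported above (assumed to be mean token cross-entropy in nats)
base_loss = 1.3214
finetuned_loss = 0.7126

base_ppl = math.exp(base_loss)
finetuned_ppl = math.exp(finetuned_loss)

# Fraction by which perplexity drops after fine-tuning
relative_drop = 1.0 - finetuned_ppl / base_ppl

print(f"base perplexity:       {base_ppl:.4f}")
print(f"fine-tuned perplexity: {finetuned_ppl:.4f}")
print(f"relative reduction:    {relative_drop:.1%}")
```

The recomputed values match the reported perplexities to about three decimal places, a reduction of roughly 46% relative to the base model.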
### 📊 Generation Quality Metrics (Test)

- **Exact Match:** `0.0033`
- **Normalized Exact Match:** `0.0033`
- **BLEU:** `18.43`
- **chrF:** `34.06`
- **ROUGE-L (F1):** `0.2417`

Note the mixed scales: BLEU and chrF are on a 0-100 scale, while Exact Match and ROUGE-L appear to be fractions in the range 0-1.

### 🧠 CodeBERTScore (Mean)

- **Precision:** `0.7187`
- **Recall:** `0.7724`
- **F1:** `0.7421`
- **F3:** `0.7657`

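
For readers unfamiliar with the F3 figure: F-beta combines precision and recall with recall weighted beta times as heavily, so F3 leans strongly towards recall. The sketch below applies the standard F-beta formula to the mean precision and recall above. It assumes the reported F1 and F3 were averaged per sample, which would explain why they differ slightly from the values computed here from the means.

```python
def f_beta(precision, recall, beta):
    """Standard F-beta score; beta > 1 weights recall more heavily."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


# Mean precision and recall as reported above
mean_precision = 0.7187
mean_recall = 0.7724

f1_of_means = f_beta(mean_precision, mean_recall, beta=1)
f3_of_means = f_beta(mean_precision, mean_recall, beta=3)

print(f"F1 from mean P/R: {f1_of_means:.4f} (reported mean F1: 0.7421)")
print(f"F3 from mean P/R: {f3_of_means:.4f} (reported mean F3: 0.7657)")
```

Because recall is higher than precision here, F3 comes out above F1 in both the reported and the recomputed values.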
### 🧾 Training Summary (from logs)

- **Train loss:** `~0.6903`
- **Eval loss:** `~0.6877`

---

## Example Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Base + adapter
base_id = "codellama/CodeLlama-7b-Python-hf"
adapter_id = "Tanneru/CodeLlama-7b-Python-hf-ft"

# Load tokenizer (repo includes tokenizer files)
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Load base model (device_map="auto" requires the accelerate package)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    device_map="auto",
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
)

# Load LoRA adapter on top of the base model and switch to inference mode
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

prompt = "Write a Python function that checks if a number is prime."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate without tracking gradients
with torch.inference_mode():
    out = model.generate(**inputs, max_new_tokens=256)

print(tokenizer.decode(out[0], skip_special_tokens=True))
```

## Citation

If you use this model in your research or project, please cite it:

```bibtex
@misc{tanneru2025codellamapythonft,
  title = {CodeLlama-7b-Python-hf-ft},
  author = {Tanneru},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Tanneru/CodeLlama-7b-Python-hf-ft}}
}
```