📖 Glossary

Key terms, briefly explained.

References

Foundational papers

  • Attention Is All You Need — Vaswani et al. (2017)
  • Language Models are Unsupervised Multitask Learners (GPT-2) — Radford et al. (2019)
  • Scaling Laws for Neural Language Models — Kaplan et al. (2020)
  • Training Compute-Optimal Large Language Models (Chinchilla) — Hoffmann et al. (2022)

Open models

  • LLaMA: Open and Efficient Foundation Language Models — Meta (2023)
  • LLaMA 2: Open Foundation and Fine-Tuned Chat Models — Meta (2023)
  • The Llama 3 Herd of Models — Meta (2024)
  • DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model — DeepSeek (2024)
  • DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning — DeepSeek (2025)

Hardware and inference

  • FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness — Dao et al. (2022)
  • Efficient Memory Management for Large Language Model Serving with PagedAttention (vLLM) — Kwon et al. (2023)
  • llama.cpp — Gerganov et al. (2023-2026)
  • NVIDIA Ada GPU Architecture Whitepaper — NVIDIA (2022)
  • H100 Tensor Core GPU Architecture — NVIDIA (2022)

Alignment and hallucinations

  • Training Language Models to Follow Instructions with Human Feedback (InstructGPT) — Ouyang et al. (2022)
  • Constitutional AI: Harmlessness from AI Feedback — Anthropic (2022)
  • DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (introduces GRPO, Group Relative Policy Optimization) — DeepSeek (2024)

Document v2 — Hermes — June 2026