Initial commit: kantine2ical CLI, Flask-Server, Docker

6
.dockerignore
Normal file
@@ -0,0 +1,6 @@
.venv
__pycache__
*.pyc
.git
*.ics
.gitignore

45
.gitignore
vendored
Normal file
@@ -0,0 +1,45 @@
# Python
.venv/
venv/
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Generated output
*.ics

# IDE / Editor
.idea/
.vscode/
*.swp
*.swo
*~

# Environment
.env
.env.local
*.local

# OS
.DS_Store
Thumbs.db

# Logs
*.log

15
Dockerfile
Normal file
@@ -0,0 +1,15 @@
FROM python:3.12-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY kantine2ical.py app.py ./

RUN useradd -m appuser
USER appuser

EXPOSE 8000

CMD ["gunicorn", "-w", "1", "-b", "0.0.0.0:8000", "app:app"]

98
README.md
Normal file
@@ -0,0 +1,98 @@
# Kantine BHZ Kiel-Wik – Menu to iCal

Reads the PDF menus published on [kantine-bhz.de](http://kantine-bhz.de) and generates an iCal file (`.ics`) for import into Google Calendar. Each event lists all of the day's dishes (I. to V.) and is scheduled for 12:00 noon.
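
To illustrate what "all of the day's dishes (I. to V.)" means in practice, here is a minimal sketch of how a single day block could be turned into five entries. The two patterns are copied from `kantine2ical.py`; the sample text and the helper `parse_day` are made up for illustration, and real PDF text will differ in detail.

```python
import re
from datetime import date

# Same patterns as in kantine2ical.py.
DATE_LINE_RE = re.compile(
    r"(Montag|Dienstag|Mittwoch|Donnerstag|Freitag),\s*den\s+(\d{2})\.(\d{2})\.(\d{4})",
    re.IGNORECASE,
)
DISH_LINE_RE = re.compile(r"^\s*(IV\.?|V\.?|I{1,3}\.?)\s*(.*)$", re.IGNORECASE)
ROMAN_ORDER = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5}

# Invented sample text (assumption): dish III. and V. are missing on purpose.
SAMPLE = """\
Montag, den 06.01.2025
I. Linseneintopf
II. Schnitzel mit Pommes
IV. Salatteller
"""


def parse_day(text: str) -> tuple[date, list[str]]:
    """Parse one day block; dishes that do not appear become "–"."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    m = DATE_LINE_RE.match(lines[0])
    if m is None:
        raise ValueError("first line is not a date line")
    _, d, mo, y = m.groups()
    dishes: dict[int, str] = {}
    for line in lines[1:]:
        dish_m = DISH_LINE_RE.match(line)
        if not dish_m:
            continue
        idx = ROMAN_ORDER.get(dish_m.group(1).upper().rstrip("."))
        # The first occurrence of each numeral wins.
        if idx is not None and idx not in dishes:
            dishes[idx] = line
    return date(int(y), int(mo), int(d)), [dishes.get(j, "–") for j in range(1, 6)]


day, dishes = parse_day(SAMPLE)
print(day.isoformat(), dishes)
```

The result always has exactly five entries, so the event description keeps the same layout even when the PDF lists fewer dishes.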

## Prerequisites

- Python 3.10+ (the code uses `zoneinfo` and `X | None` type annotations)
- An existing venv in `.venv`

## Installation

1. Activate the virtual environment:

   **Windows (PowerShell):**
   ```powershell
   .venv\Scripts\Activate.ps1
   ```

   **Windows (CMD):**
   ```cmd
   .venv\Scripts\activate.bat
   ```

2. Install the dependencies:
   ```bash
   pip install -r requirements.txt
   ```

## Usage

```bash
python kantine2ical.py
```

By default this writes the file `kantine_speiseplan.ics` to the current directory.

**Options:**

- `-o FILE` / `--output FILE`: output file (default: `kantine_speiseplan.ics`)
- `--url URL`: base URL of the canteen website (default: `http://kantine-bhz.de`)

Example:
```bash
python kantine2ical.py -o mein_speiseplan.ics
```

## Google Calendar import

1. Open [Google Calendar](https://calendar.google.com)
2. Next to "My calendars", click the three-dot menu → **Import**
3. Select the generated `.ics` file and assign it to the desired calendar

The events appear at 12:00 in the Europe/Berlin time zone, with all five daily dishes in the description.
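
Before importing, it can help to check what the file actually contains. The following is a minimal sketch using only the standard library; `unfold`, `list_events` and the hand-written `SAMPLE_ICS` are hypothetical names for illustration, and the exact properties in a generated file may differ from this sample.

```python
def unfold(ics_text: str) -> list[str]:
    """Undo iCalendar line folding: a line starting with a space or tab
    continues the previous line."""
    lines: list[str] = []
    for raw in ics_text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def list_events(ics_text: str) -> list[dict[str, str]]:
    """Return one dict per VEVENT, mapping property name to its raw value."""
    events: list[dict[str, str]] = []
    current = None
    for line in unfold(ics_text):
        if line == "BEGIN:VEVENT":
            current = {}
        elif line == "END:VEVENT" and current is not None:
            events.append(current)
            current = None
        elif current is not None and ":" in line:
            name, _, value = line.partition(":")
            # Drop parameters such as ";TZID=Europe/Berlin" from the name.
            current[name.split(";", 1)[0]] = value
    return events


# Hand-written sample (assumption), not output of kantine2ical.py.
SAMPLE_ICS = (
    "BEGIN:VCALENDAR\r\n"
    "BEGIN:VEVENT\r\n"
    "SUMMARY:Kantine BHZ Kiel-Wik\r\n"
    "DTSTART;TZID=Europe/Berlin:20250106T120000\r\n"
    "DTEND;TZID=Europe/Berlin:20250106T130000\r\n"
    "DESCRIPTION:I. Linseneintopf\\nII. Schnitzel\r\n"
    "END:VEVENT\r\n"
    "END:VCALENDAR\r\n"
)

for event in list_events(SAMPLE_ICS):
    print(event["DTSTART"], event["SUMMARY"])
```

To inspect a real export, pass the file contents instead of the sample, for example `list_events(open("kantine_speiseplan.ics", encoding="utf-8").read())`.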

---

## Server mode (subscribable URL)

The menu can be offered as an **external calendar** via URL. A Flask server serves the iCal data; Google Calendar and other clients can subscribe to the URL directly. In the background, the server checks for new menu PDFs once a day and updates the calendar.

**Prerequisite:** For "From URL" in Google Calendar, the server URL must be reachable from outside (public IP, reverse proxy, or e.g. ngrok for testing).

1. Install the dependencies (including Flask): `pip install -r requirements.txt`
2. Start the server:
   ```bash
   python app.py
   ```
   Or with the Flask CLI: `flask --app app run --host 0.0.0.0 --port 5000`
3. Subscription URL for Google Calendar: `http://<host>:5000/calendar.ics` (replace `<host>` and, if needed, port 5000 with your own values).
4. In Google Calendar: "Add other calendars" → "From URL" → enter the URL above.

**Configuration (optional, environment variables):**

- `KANTINE_BASE_URL`: base URL of the canteen website (default: `http://kantine-bhz.de`)
- `REFRESH_INTERVAL_SECONDS`: seconds between refreshes (default: 86400 = 24 h)
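
As a rough sketch of how these two variables can be read with fallbacks: the helper `load_config` and its handling of invalid values are assumptions for illustration. `app.py` itself simply calls `int(...)` on the value and would fail at startup on a non-numeric setting.

```python
import os

DEFAULT_BASE_URL = "http://kantine-bhz.de"
DEFAULT_REFRESH_SECONDS = 86400  # 24 h


def load_config(env=None) -> tuple[str, int]:
    """Return (base_url, refresh_seconds), falling back to the defaults.

    `env` is any mapping; it defaults to os.environ. Non-numeric or
    non-positive intervals fall back to the default (an assumption of
    this sketch, not the behavior of app.py).
    """
    env = os.environ if env is None else env
    base_url = env.get("KANTINE_BASE_URL", DEFAULT_BASE_URL)
    raw = env.get("REFRESH_INTERVAL_SECONDS", str(DEFAULT_REFRESH_SECONDS))
    try:
        seconds = int(raw)
    except ValueError:
        seconds = DEFAULT_REFRESH_SECONDS
    if seconds <= 0:
        seconds = DEFAULT_REFRESH_SECONDS
    return base_url, seconds


# Example: refresh every hour instead of every day.
print(load_config({"REFRESH_INTERVAL_SECONDS": "3600"}))
```

Passing the mapping explicitly keeps the function easy to try out without touching the real process environment.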

---

## Docker (Production)

To run the server as a container (e.g. on a server):

**Build and run:**
```bash
docker build -t kantine2ical .
docker run -p 8000:8000 kantine2ical
```

Subscription URL: `http://<host>:8000/calendar.ics`

**With Docker Compose:**
```bash
docker compose up -d
```

The container listens on port 8000 and restarts automatically if needed (`restart: unless-stopped`).

**Note:** For production use, a **reverse proxy with HTTPS** (e.g. Traefik, Caddy, or nginx) in front of the container is recommended so that Google can fetch the calendar URL reliably.

83
app.py
Normal file
@@ -0,0 +1,83 @@
#!/usr/bin/env python3
"""
Flask server: canteen menu as a subscribable iCal URL.
A daily background refresh downloads new PDFs and updates the calendar.
"""

import logging
import os
import threading
import time

from flask import Flask, Response

from kantine2ical import BASE_URL, empty_ical_bytes, refresh_speiseplan

# Configuration (environment variables with fallback)
KANTINE_BASE_URL = os.environ.get("KANTINE_BASE_URL", BASE_URL)
REFRESH_INTERVAL_SECONDS = int(os.environ.get("REFRESH_INTERVAL_SECONDS", "86400"))  # 24h

app = Flask(__name__)
_log = logging.getLogger(__name__)

# Cache: most recent valid iCal bytes; the lock guards access
_ical_cache: bytes = empty_ical_bytes()
_cache_lock = threading.Lock()

def _do_refresh() -> None:
    """Run a refresh and update the cache on success."""
    global _ical_cache
    result = refresh_speiseplan(KANTINE_BASE_URL)
    if result is not None:
        _, ical_bytes = result
        with _cache_lock:
            _ical_cache = ical_bytes
        _log.info("Menu refresh succeeded; cache updated.")
    else:
        _log.warning("Menu refresh failed or returned no data; cache unchanged.")


def _refresh_loop() -> None:
    """Background thread: run a refresh every REFRESH_INTERVAL_SECONDS."""
    while True:
        time.sleep(REFRESH_INTERVAL_SECONDS)
        try:
            _do_refresh()
        except Exception as e:
            _log.exception("Refresh thread: %s", e)


# At import time: refresh once, then start the background thread (also applies under Gunicorn)
logging.basicConfig(level=logging.INFO)
_log.info("Running initial menu refresh at startup ...")
_do_refresh()
_refresh_thread = threading.Thread(target=_refresh_loop, daemon=True)
_refresh_thread.start()
_log.info("Background refresh every %s seconds.", REFRESH_INTERVAL_SECONDS)


@app.route("/calendar.ics")
def calendar_ics() -> Response:
    """Serve the iCal calendar (subscription URL, e.g. for Google Calendar)."""
    with _cache_lock:
        data = _ical_cache
    return Response(
        data,
        content_type="text/calendar; charset=utf-8",  # content_type: sent as-is, no second charset appended
        headers={"Content-Disposition": 'attachment; filename="kantine_speiseplan.ics"'},
    )


@app.route("/")
def index() -> Response:
    """Serve the same response as /calendar.ics (no redirect is issued)."""
    return calendar_ics()


def main() -> None:
    """Start the Flask development server (refresh and thread already run at import time)."""
    app.run(host="0.0.0.0", port=5000)


if __name__ == "__main__":
    main()

6
docker-compose.yml
Normal file
@@ -0,0 +1,6 @@
services:
  kantine2ical:
    build: .
    ports:
      - "8000:8000"
    restart: unless-stopped

249
kantine2ical.py
Normal file
@@ -0,0 +1,249 @@
#!/usr/bin/env python3
"""
Read the menu of kantine-bhz.de from its PDFs and export it as iCal (.ics).
One event per day at 12:00 with all daily dishes (I. to V.).
"""

import argparse
import io
import re
from datetime import date, datetime, time
from zoneinfo import ZoneInfo
from urllib.parse import urljoin

import pdfplumber
import requests
from bs4 import BeautifulSoup
from icalendar import Calendar, Event

# Constants
BASE_URL = "http://kantine-bhz.de"
DEFAULT_OUTPUT = "kantine_speiseplan.ics"
SUMMARY = "Kantine BHZ Kiel-Wik"
TIMEZONE = "Europe/Berlin"
EVENT_START = time(12, 0)
EVENT_END = time(13, 0)

# Regex: "<weekday>, den DD.MM.YYYY" (German date line in the PDF)
DATE_LINE_RE = re.compile(
    r"(Montag|Dienstag|Mittwoch|Donnerstag|Freitag),\s*den\s+(\d{2})\.(\d{2})\.(\d{4})",
    re.IGNORECASE,
)
# Lines starting with I., II., III., IV., V. (IV/V are tried before I/II/III)
DISH_LINE_RE = re.compile(r"^\s*(IV\.?|V\.?|I{1,3}\.?)\s*(.*)$", re.IGNORECASE)

ROMAN_ORDER = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5}


def fetch_speiseplan_pdf_urls(base_url: str = BASE_URL) -> list[str]:
    """Load the start page and collect all menu PDF links (without duplicates)."""
    resp = requests.get(base_url, timeout=30)
    resp.raise_for_status()
    resp.encoding = resp.apparent_encoding or "utf-8"
    soup = BeautifulSoup(resp.text, "html.parser")
    seen = set()
    urls = []
    for a in soup.find_all("a", href=True):
        href = a["href"].strip()
        if not href.lower().endswith(".pdf"):
            continue
        if "Speiseplan" not in href:
            continue
        full = urljoin(base_url, href)
        if full in seen:
            continue
        seen.add(full)
        urls.append(full)
    return urls


def download_pdf(url: str) -> bytes:
    """Download the PDF from the given URL."""
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    return resp.content


def extract_text_from_pdf(pdf_bytes: bytes) -> str:
    """Extract the text of all pages with pdfplumber."""
    text_parts = []
    with pdfplumber.open(io.BytesIO(pdf_bytes)) as pdf:
        for page in pdf.pages:
            t = page.extract_text()
            if t:
                text_parts.append(t)
    return "\n".join(text_parts)


def _normalize_roman(roman: str) -> int | None:
    r = roman.upper().rstrip(".")
    return ROMAN_ORDER.get(r)


def parse_speiseplan_text(text: str) -> list[tuple[date, list[str]]]:
    """
    Extract the five dishes I.–V. for each day (date) from the PDF text.
    Returns a list of (date, [dish_i, ..., dish_v]).
    Missing dishes are filled in as "–".
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    result: list[tuple[date, list[str]]] = []
    i = 0
    while i < len(lines):
        m = DATE_LINE_RE.match(lines[i])
        if not m:
            i += 1
            continue
        day_name, d, mo, y = m.groups()
        try:
            dt = date(int(y), int(mo), int(d))
        except ValueError:
            i += 1
            continue
        dishes: dict[int, str] = {}
        i += 1
        while i < len(lines):
            line = lines[i]
            if DATE_LINE_RE.match(line):
                break
            if line.startswith("_" * 10) or "Öffnungszeiten" in line or "Speisen enthalten" in line:
                break
            dish_m = DISH_LINE_RE.match(line)
            if dish_m:
                roman_part, rest = dish_m.groups()
                idx = _normalize_roman(roman_part.strip().rstrip("."))
                if idx is not None and 1 <= idx <= 5 and idx not in dishes:
                    dishes[idx] = f"{roman_part.strip()} {rest}".strip() if rest else roman_part.strip()
            i += 1
        # Always exactly 5 entries; missing ones become "–"
        ordered = [dishes.get(j, "–") for j in range(1, 6)]
        result.append((dt, ordered))
    return result


def merge_day_events(all_parsed: list[list[tuple[date, list[str]]]]) -> dict[date, list[str]]:
    """Merge the results of all PDFs; for duplicate dates the last occurrence wins."""
    by_date: dict[date, list[str]] = {}
    for day_list in all_parsed:
        for d, dishes in day_list:
            by_date[d] = dishes
    return by_date


def empty_ical_bytes() -> bytes:
    """Return a minimal empty iCal calendar (no events) as bytes."""
    cal = Calendar()
    cal.add("prodid", "-//Kantine BHZ Kiel-Wik Speiseplan//kantine2ical//DE")
    cal.add("version", "2.0")
    cal.add("CALSCALE", "GREGORIAN")
    return cal.to_ical()


def build_ical_bytes(by_date: dict[date, list[str]]) -> bytes:
    """Build the iCal calendar as bytes (events at 12:00, Europe/Berlin)."""
    cal = Calendar()
    cal.add("prodid", "-//Kantine BHZ Kiel-Wik Speiseplan//kantine2ical//DE")
    cal.add("version", "2.0")
    cal.add("CALSCALE", "GREGORIAN")

    tz_berlin = ZoneInfo(TIMEZONE)

    for d, dishes in sorted(by_date.items()):
        event = Event()
        event.add("uid", f"kantine-bhz-{d.isoformat()}@kantine2ical.local")
        event.add("summary", SUMMARY)
        desc = "\n".join(dishes)
        event.add("description", desc)
        start_dt = datetime.combine(d, EVENT_START, tzinfo=tz_berlin)
        end_dt = datetime.combine(d, EVENT_END, tzinfo=tz_berlin)
        event.add("dtstart", start_dt)
        event.add("dtend", end_dt)
        cal.add_component(event)

    return cal.to_ical()


def build_ical(by_date: dict[date, list[str]], output_path: str) -> None:
    """Build the iCal calendar and write it to output_path (12:00, Europe/Berlin)."""
    with open(output_path, "wb") as f:
        f.write(build_ical_bytes(by_date))


def refresh_speiseplan(base_url: str = BASE_URL) -> tuple[dict[date, list[str]], bytes] | None:
    """
    Run the complete pipeline: fetch URLs, download PDFs, parse, merge, build iCal.
    Returns (by_date, ical_bytes), or None on error or when no data was found.
    """
    try:
        urls = fetch_speiseplan_pdf_urls(base_url)
        if not urls:
            return None
        all_parsed: list[list[tuple[date, list[str]]]] = []
        for url in urls:
            try:
                pdf_bytes = download_pdf(url)
                text = extract_text_from_pdf(pdf_bytes)
                days = parse_speiseplan_text(text)
                all_parsed.append(days)
            except Exception:
                continue
        if not all_parsed:
            return None
        by_date = merge_day_events(all_parsed)
        ical_bytes = build_ical_bytes(by_date)
        return (by_date, ical_bytes)
    except Exception:
        return None


def main() -> None:
    parser = argparse.ArgumentParser(
        description="Export the menu of kantine-bhz.de as iCal."
    )
    parser.add_argument(
        "-o",
        "--output",
        default=DEFAULT_OUTPUT,
        metavar="FILE",
        help=f"Output .ics file (default: {DEFAULT_OUTPUT})",
    )
    parser.add_argument(
        "--url",
        default=BASE_URL,
        metavar="URL",
        help=f"Base URL of the canteen website (default: {BASE_URL})",
    )
    args = parser.parse_args()

    output_path: str = args.output
    if not output_path.lower().endswith(".ics"):
        output_path = output_path + ".ics"

    print("Loading menu PDFs from", args.url, "...")
    urls = fetch_speiseplan_pdf_urls(args.url)
    if not urls:
        print("No menu PDFs found.")
        return
    print(f" {len(urls)} PDF(s) found.")

    all_parsed: list[list[tuple[date, list[str]]]] = []
    for url in urls:
        try:
            pdf_bytes = download_pdf(url)
            text = extract_text_from_pdf(pdf_bytes)
            days = parse_speiseplan_text(text)
            all_parsed.append(days)
        except Exception as e:
            print(f" Error for {url}: {e}")

    if not all_parsed:
        print("No data read from PDFs.")
        return

    by_date = merge_day_events(all_parsed)
    build_ical(by_date, output_path)
    print(f"Calendar with {len(by_date)} events written: {output_path}")


if __name__ == "__main__":
    main()

6
requirements.txt
Normal file
@@ -0,0 +1,6 @@
requests>=2.28.0
beautifulsoup4>=4.11.0
pdfplumber>=0.10.0
icalendar>=5.0.0
flask>=3.0.0
gunicorn>=21.0.0