Marketing a Personal GitHub Pages + Astro Blog (SEO & GEO)
This is the surface I actually own. A static Astro blog on GitHub Pages is about as SEO-friendly as it gets by default — but “by default” hides a couple of gotchas that quietly cost you ranking, and getting cited by AI engines needs a deliberate push because a brand-new domain has no authority yet. Here’s the whole playbook, using this site (zywkloo.github.io) as the example.
摘要:Astro + GitHub Pages 的技术 SEO 基本免费,但有几个暗坑(尤其 title 混入 HTML 标签)。GEO 上,新域名没权重,靠双语内容 + canonical 跨站转载快速被 AI 收录引用。
The technical SEO Astro gives you for free (just don’t forget it)
Astro’s static output (SSG) with near-zero client JS already does most of the work. The few things to actually wire up:
- Sitemap.
@astrojs/sitemapis enabled inastro.config.mjs; after a build it emitssitemap-index.xml. The one manual step that matters: submit it to Google Search Console so crawling is pull, not wait-and-pray. - Per-page metadata via
BaseHead.astro. Pass a realtitle(50–60 chars) anddescription(120–150 chars, with the keywords you want —iOS,React Native,FastAPI) to every page.BaseHeadalready emits the<link rel="canonical">, which quietly saves you from duplicate-content penalties against thegithub.iomirror. - Fonts & images.
BaseHeadpreloads the Atkinson font (kills the flash, helps LCP); use Astro’s<Image />for posts so JP/PNG auto-convert to WebP/AVIF. Both feed straight into Core Web Vitals.
None of this is clever. It’s just a checklist you run once.
The gotcha that actually bites: HTML in your <title>
关键坑:Frontmatter 的
title必须是纯文本,不能塞<em>之类标签。
This one’s worth calling out because it’s silent and self-inflicted. If a post’s frontmatter title contains an HTML tag:
title: 'TokenChef (Part 4): Stop Reinventing the Agent — <em>wtcraft</em>'
…that tag renders straight into <head>’s <title> element. The result: garbled character entities (<em>) in the browser tab and in search results, and Google flagging the page as non-canonical and demoting it. The frontmatter title must be plain text. If you want a word highlighted in the rendered heading, do it in the Astro component at render time (a regex pass that wraps the term), never in the data source.
GEO for a bilingual blog
The blog’s job in the GEO economy is to be the citable explanation behind your projects. Two things specific to a bilingual blog:
- Bilingual descriptions. If a post is中英双语, the
descriptionshould carry retrieval keywords in both languages so search engines in either language match the right snippet:description: "Comparing OpenAI Codex vs. Claude Code architecture. / 从 Claude Code 源码泄露与 CCB 逆向架构出发,探讨 wtcraft 如何做实任务边界与验收。" - Question-shaped headings, not literary ones. Replace
## 引子:从一次源码泄露说起with something a person would actually ask a model:## What is the six-layer Claude Code architecture reverse-engineered by CCB?. Then answer it immediately underneath, and add a 2–3 question FAQ at the end — that high-density block is what the retriever pulls first.
This is the same BLUF/table/FAQ discipline from the overview, applied to long-form prose.
Getting discovered fast: canonical cross-posting
A new GitHub Pages domain has no authority, so search engines are slow to find it. You borrow authority from high-DA platforms — without splitting your own — by cross-posting with a canonical pointer home.
摘要:用高权重平台(Dev.to、Medium)发英文版,但把 Canonical URL 设回自己站 — 流量和权重结算到你的域名,不是平台。
- Dev.to / Medium. Publish the English version on Dev.to (DA 90+); high-authority posts often hit Google’s first page within minutes. The non-negotiable step: in the post settings, set the Canonical URL to your own page:
https://zywkloo.github.io/blog/dont-build-another-agent-codex-claude-code-wtcraft/Google then settles the backlinks, traffic, and social signals onto your domain and accelerates indexing of your original. - Where the AI crawlers loiter. Share to Hacker News (Show HN) and topical subreddits (e.g. r/LocalLLaMA). Real-time engines like Perplexity watch trending links on these platforms, which can pull your post into their RAG index almost immediately.
Verify it (验收)
A week after publishing, run the identity and project prompts from the overview on Perplexity and ChatGPT Search. Passing looks like: an answer about “Victor Zhang (zywkloo)” that’s accurate and links back here, and a wtcraft explanation that cites a zywkloo.github.io/blog/... page rather than a third-party rehash. If a third-party copy outranks your original, your canonical tags aren’t set right — fix those first.