# Most Frequent Character in a Country Name

This analysis identifies the country whose official name repeats a single letter the most times.  
Because “official” country lists vary across sources, the dataset used here is the **United Nations list of sovereign states**, as published on [Wikipedia](https://en.wikipedia.org/wiki/List_of_sovereign_states#List_of_states).

The following Python script counts letters in each country name (case-insensitive, ignoring spaces and punctuation) and reports the country with the single most frequently repeated character.

```python
def char_count(name: str):
    """
    Return (most_frequent_char, count, error_or_none) for a given country name.
    Counts letters only, case-insensitive.
    """
    if not isinstance(name, str) or not name:
        return None, 0, "err: passed an empty string or non-string value"

    counts = {}
    for ch in name.lower():
        # Use .isalpha() to ignore spaces, commas, etc.
        # Remove this check if you want to count all characters
        if ch.isalpha():
            counts[ch] = counts.get(ch, 0) + 1

    if not counts:
        return None, 0, "err: no countable characters"

    # Pick the (char, count) pair with the highest count
    most_char, max_count = max(counts.items(), key=lambda kv: kv[1])
    return most_char, max_count, None


def main():
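    # The list is abbreviated here. The entry that produces the output
    # shown below is included so the snippet reproduces that result.
    countries = [
        "Afghanistan", "Albania", "Algeria", "Andorra", "Angola",
        "Antigua and Barbuda", "Argentina", "Armenia", "Australia",
        # ... remaining names omitted ...
        "United Kingdom of Great Britain and Northern Ireland",
        "United States of America", "Uruguay", "Uzbekistan", "Vanuatu",
        "Venezuela, Bolivarian Republic of", "Viet Nam", "Yemen",
        "Zambia", "Zimbabwe"
    ]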

    max_country = None
    max_char = None
    max_count = -1

    for country in countries:
        ch, cnt, err = char_count(country)
        if err is None and cnt > max_count:
            max_country, max_char, max_count = country, ch, cnt

    print(
        "Most frequently repeated character\n"
        f"Country: {max_country}\nChar: {max_char}\nNum: {max_count}"
    )


if __name__ == "__main__":
    main()
```

## Output

The script identifies **“United Kingdom of Great Britain and Northern Ireland”** as the country name in which a single letter repeats most often.  
The letter **“n”** appears **seven times**.

```
Most frequently repeated character
Country: United Kingdom of Great Britain and Northern Ireland
Char: n
Num: 7
```
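
As a cross-check on the count above, the same tally can be reproduced with `collections.Counter` from the standard library. This is a minimal sketch, not the script used for the analysis; the helper name `top_letter` is hypothetical, and it mirrors the letters-only, case-insensitive rule of `char_count`.

```python
from collections import Counter


def top_letter(name: str):
    """Return (letter, count) for the most common letter in name, or None."""
    # Keep letters only and fold case, mirroring char_count above.
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return None
    # most_common(1) returns a one-element list of (letter, count).
    return Counter(letters).most_common(1)[0]


print(top_letter("United Kingdom of Great Britain and Northern Ireland"))
# prints ('n', 7)
```

The runners-up in this name are "i" and "r" with five occurrences each, so the result does not depend on how ties are broken.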

---

# Prompt & Model Experimentation

To evaluate how different LLMs handle this question, I tested multiple prompts and models using the OpenAI API.  
The experiment compared various phrasing strategies and model versions to measure accuracy and consistency.
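
The harness used for the experiment is not shown here. As a rough sketch of how consistency across repeated calls could be measured, the snippet below sends the same prompt several times and reports the share of runs that agree on the most common answer. The `ask` parameter is a hypothetical placeholder for whatever function sends a prompt to the API, and `fake_ask` is an offline stand-in so the example runs without network access.

```python
from collections import Counter
from typing import Callable


def consistency(ask: Callable[[str, str], str], model: str, prompt: str, runs: int = 5):
    """Return (most_common_answer, share_of_runs) over repeated calls."""
    # `ask` is an assumed interface: it takes (model, prompt) and returns text.
    answers = [ask(model, prompt).strip().lower() for _ in range(runs)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / runs


def fake_ask(model: str, prompt: str) -> str:
    """Offline stand-in that always returns the same answer."""
    return "United Kingdom of Great Britain and Northern Ireland"


print(consistency(fake_ask, "some-model", "Which country name repeats a letter most?"))
```

Comparing normalized strings only works when the model returns a bare answer; free-form responses would need an extraction step first.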

## Models Tested

| Model | Accuracy | Required Source Prompt | Notes |
|--------|-----------|-----------------------|--------|
| GPT-4  | ❌ Often incorrect | ✅ Yes | Miscounted or inconsistent results |
| GPT-4o | ❌ Similar to GPT-4 | ✅ Yes | Slightly improved consistency |
| GPT-5  | ✅ Correct | ✅ Yes | Matched expected answer consistently |

## Prompt Characteristics

1. A simple text prompt asking the core question.  
2. An extended prompt instructing the model to treat vowels and consonants equally and to include multi-word names.  
3. A further extension emphasizing that stop words (e.g., *the*, *of*, *and*) are included.  
4. A more detailed prompt that adds a step-by-step task list.  
5. Prompts directing the model to use the **United Nations Member States** list as the official country source.
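
The characteristics above can be sketched as a small prompt grid. The opening question is taken from the logged prompts; the extension sentences, the labels, and the assumption that each extension stacks on the previous one are hypothetical and stand in for the actual wording used in the experiment.

```python
from itertools import accumulate

# Opening question as it appears in the logged prompts.
BASE = ("In the context of world geography, can you tell me what country "
        "has the same letter repeated the most in its name?")

# Hypothetical wording for the extensions described in the list above.
EXTENSIONS = [
    "Treat vowels and consonants equally, and include multi-word names.",
    "Count letters in minor words such as 'the', 'of', and 'and'.",
    "Work step by step: list the names, count each letter, then compare.",
]

# Hypothetical wording for the source instruction.
SOURCE_NOTE = "Use the United Nations Member States list as the country source."


def build_variants():
    """Return (label, prompt) pairs, each level with and without the source note."""
    # accumulate() yields the base prompt, then base + 1 extension, and so on.
    stacked = list(accumulate([BASE] + EXTENSIONS, lambda a, b: f"{a} {b}"))
    variants = []
    for level, text in enumerate(stacked, start=1):
        variants.append((f"level{level}", text))
        variants.append((f"level{level}-with-source", f"{text} {SOURCE_NOTE}"))
    return variants


for label, prompt in build_variants():
    print(label, len(prompt))
```

Generating the variants from shared pieces keeps the wording identical across levels, so differences in model output can be attributed to the added instruction and not to incidental rephrasing.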

---

# Discussion of Results

## Performance by Model

Older models such as GPT‑4 and GPT‑4o performed inconsistently. In most cases, they produced incorrect results or miscounted letters.  
GPT‑5, by contrast, returned accurate and consistent results, especially when explicitly prompted to reference the UN Member States list.

## Why GPT‑5 Succeeded

The GPT‑5 model responded correctly across multiple prompt variations.  
The most reliable answers came from runs that combined:

* The GPT‑5 model  
* A link to the UN website as the authoritative source  
* A clear explanation of how to count letters, including those in conjunctions and prepositions  

This suggests that GPT‑5’s performance benefits from both precise task instructions and explicit grounding in a definitive dataset.

---

# Successful Prompts and Responses

Below are selected prompt–response pairs in JSON format.

```json
[
  {
    "model": "gpt-4",
    "category": "g-promptLetterDescMinorWordsTaskList-WithSource",
    "prompt_text": "In the context of world geography, can you tell me what country has the same letter repeated the most in its name?...",
    "prompt_resp": "From my training data, the longest country name is 'The United Kingdom of Great Britain and Northern Ireland'..."
  },
  {
    "model": "gpt-5",
    "category": "b-promptSimple-WithSource",
    "prompt_text": "In the context of world geography, can you tell me what country has the same letter repeated the most in its name?...",
    "prompt_resp": "Short answer: United Kingdom of Great Britain and Northern Ireland..."
  }
]
```
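
Records in this shape lend themselves to a simple per-model tally. The sketch below is one possible way to score them, not the method used for the table earlier: the function name `accuracy_by_model` is hypothetical, the sample records are made up for illustration, and a response counts as correct if it merely contains the expected country name.

```python
import json
from collections import defaultdict

EXPECTED = "united kingdom of great britain and northern ireland"


def accuracy_by_model(records):
    """Return {model: fraction of responses naming the expected country}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        totals[rec["model"]] += 1
        # Substring match is a crude proxy: it ignores the letter and the count.
        if EXPECTED in rec["prompt_resp"].lower():
            hits[rec["model"]] += 1
    return {model: hits[model] / totals[model] for model in totals}


# Made-up records in the same shape as the JSON above, for illustration only.
sample = json.loads("""
[
  {"model": "model-a", "category": "x", "prompt_text": "...", "prompt_resp": "The answer is Mexico."},
  {"model": "model-a", "category": "y", "prompt_text": "...", "prompt_resp": "United Kingdom of Great Britain and Northern Ireland"},
  {"model": "model-b", "category": "x", "prompt_text": "...", "prompt_resp": "Short answer: United Kingdom of Great Britain and Northern Ireland."}
]
""")
print(accuracy_by_model(sample))
# prints {'model-a': 0.5, 'model-b': 1.0}
```

A substring check of this kind would also accept a response that names the country for a different reason, such as having the longest name, so a stricter scorer would also verify the reported letter and count.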

---

# Summary

Among UN‑recognized sovereign states, the country name **“United Kingdom of Great Britain and Northern Ireland”** repeats a single letter more often than any other (**n = 7**).  
Of the three models tested, GPT‑5 was the one that consistently produced the correct result when given detailed instructions and a definitive country list source.
