The previous chapters examined immigration AI systems that classify people for supervision and custody. This chapter moves to a different and more sensitive domain: asylum adjudication. The legal question here is not whether a person will report to ICE or whether they will be released on bond. It is whether software can influence how the government evaluates a claim of persecution.
That matters because asylum law is built on individualized assessment. The refugee definition under the Immigration and Nationality Act, 8 U.S.C. § 1101(a)(42), and the Convention and Protocol Relating to the Status of Refugees require that protection determinations rest on the specific circumstances of the individual applicant. If automated text analysis enters that process — even at a pre-adjudication screening stage — it can alter the evidentiary posture of the case before the officer reaches the merits. That is the legal problem this chapter addresses.
I. Pangaea Text — the primary documented system
DHS/USCIS/PIA-085, Privacy Impact Assessment for Pangaea: Pangaea Text (January 2021)
The most precisely documented public source for text analytics in asylum adjudication is DHS/USCIS/PIA-085, the Privacy Impact Assessment published by USCIS in January 2021 for a system called Pangaea Text.
The PIA describes Pangaea Text as a secure web-based system designed to assist in safeguarding the integrity of the asylum program by identifying fraud, national security, and public safety concerns. The system operates by applying rules and algorithms to digitized asylum applications and supplementary written statements — specifically the narrative sections of Form I-589, Part B — to detect patterns that could constitute indicators of fraud, national security, and public safety concerns. Pangaea Text transforms unstructured text from those applications into structured data that can be analyzed for pattern detection.
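The PIA's description of that transformation can be made concrete with a minimal sketch. Everything below is my illustration, not the actual Pangaea Text implementation, which is not public: the field names, the tokenization, and the phrase-matching rule are invented stand-ins for whatever rules and algorithms USCIS actually applies.

```python
# Hypothetical sketch of the pipeline PIA-085 describes: turning the free-text
# narrative of Form I-589, Part B into a structured record that rules can scan.
# All names, rules, and patterns here are illustrative assumptions.
import re
from dataclasses import dataclass, field

@dataclass
class NarrativeRecord:
    applicant_id: str
    raw_text: str
    tokens: list = field(default_factory=list)
    flags: list = field(default_factory=list)

def structure(applicant_id: str, raw_text: str) -> NarrativeRecord:
    """Convert unstructured narrative text into a structured record."""
    tokens = re.findall(r"[a-z']+", raw_text.lower())
    return NarrativeRecord(applicant_id, raw_text, tokens)

def apply_rules(record: NarrativeRecord, watch_phrases: set) -> NarrativeRecord:
    """Apply simple pattern rules; a hit produces an indicator, not a decision."""
    text = " ".join(record.tokens)
    for phrase in sorted(watch_phrases):
        if phrase in text:
            record.flags.append(f"phrase-match:{phrase}")
    return record

# Per the PIA, any flag is routed to an officer for manual review;
# the system itself adjudicates nothing.
rec = apply_rules(structure("A-0001", "They threatened my family repeatedly."),
                  {"threatened my family"})
```

The point of the sketch is the output type: the system emits indicators attached to a case record, and the legal questions in the rest of this chapter concern what happens to those indicators downstream.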
The PIA is explicit on one point of legal significance: Pangaea Text does not perform system-automated adjudications. All results are reviewed by USCIS personnel. When the system detects a relevant pattern or indicator, an Asylum Officer or Fraud Detection and National Security officer performs a manual review to evaluate the flag before any further action is taken.
That statement delimits both the system’s function and the legal argument. Pangaea Text does not deny asylum. But it can generate a fraud or risk indicator that enters the file and shapes the environment in which an officer subsequently evaluates the applicant’s credibility. The legal significance of that function is not diminished by the fact that a human signs the final decision.
II. ATLAS — the rules engine that connects screening to adjudication
DHS/USCIS/PIA-084, Privacy Impact Assessment for ATLAS (July 2021)
Pangaea Text operates within a broader USCIS screening architecture. The system most directly connected to adjudication outcomes is ATLAS — the Automated Targeting, Leads, and Analysis System — documented in DHS/USCIS/PIA-084, published in July 2021.
ATLAS is a rules-based screening system operated by USCIS’s Fraud Detection and National Security Directorate. It applies rule sets to immigration application data to generate screening referrals and, in certain circumstances, Statements of Findings that may inform eligibility determinations. The PIA confirms that ATLAS rules undergo operational, legal, privacy, and civil rights review before implementation, and that new rules are not created or implemented without approval from USCIS executive leadership. The PIA also states that ATLAS prohibits the consideration of race or ethnicity in screening and law enforcement activities except in the most exceptional instances.
ATLAS and Pangaea Text operate as complementary components of the same screening architecture. Pangaea Text handles the text-analysis layer — detecting patterns in narrative content — while ATLAS provides the broader rules-engine framework within which those detections are categorized, tracked, and potentially referred for further review. A flag generated by Pangaea Text may enter the FDNS-DS data system — the Fraud Detection and National Security Data System documented in DHS/USCIS/PIA-013 — where it becomes part of the case record accessible to adjudicating officers.
That layered architecture is legally relevant. It means that a fraud indicator generated by automated text comparison can travel from the screening system into the adjudicative case file without any formal disclosure to the applicant that automated screening occurred or that a flag was generated.
III. Why text pattern detection is legally different from ordinary fraud investigation
Fraud detection is a legitimate function. The INA authorizes USCIS to investigate and deny applications that are fraudulent, and the Fraud Detection and National Security Directorate exists precisely to exercise that authority. The legal concern with automated text analytics is not that fraud screening occurs. It is how the screening methodology interacts with the specific characteristics of asylum narratives.
Asylum applications frequently exhibit textual similarities for reasons that have no connection to fraud. Applicants who used the same legal aid organization, the same community interpreter, or the same asylum clinic may produce narratives with similar structural patterns, vocabulary, or formatting. Applicants describing the same form of systematic persecution — gang violence, political targeting, domestic abuse — may naturally use similar terminology to describe the same recurring experience. Translation from a non-English language can compress the vocabulary available for describing particular experiences, producing convergent language across independent accounts.
A text-comparison system trained to detect overlap and flag similarity as a fraud indicator may therefore confuse shared lawful resources with coordinated fabrication. The OCR process used to digitize paper I-589 forms introduces an additional layer of error: the PIA itself acknowledges that OCR has the potential to misrecognize characters, producing inaccurate text representations that may affect pattern detection in unpredictable ways.
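The false-positive mechanism is easy to demonstrate. In the sketch below (my illustration, assuming a naive bag-of-words cosine similarity as the overlap measure; the government's actual methodology is not public), two narratives with different underlying facts score as near-duplicates simply because both were drafted from the same clinic template:

```python
# Illustration: two factually distinct narratives drafted from the same
# legal-clinic template score as highly similar under a naive bag-of-words
# cosine similarity, the kind of overlap measure a text-comparison rule
# might rely on. Stdlib only; the template text is invented.
import math
import re
from collections import Counter

def cosine(a: str, b: str) -> float:
    """Cosine similarity between word-count vectors of two texts."""
    va = Counter(re.findall(r"[a-z]+", a.lower()))
    vb = Counter(re.findall(r"[a-z]+", b.lower()))
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * \
           math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

TEMPLATE = ("I fled my country because members of a gang threatened to kill me "
            "after I refused to cooperate. I fear I will be harmed if I return.")

# Independent applicants, same clinic template, different specifics:
a = TEMPLATE + " The threats began in March in my home city."
b = TEMPLATE + " The threats began in October in my village."

assert cosine(a, b) > 0.9   # flagged as suspiciously similar
assert a != b               # yet the accounts are not identical
```

Neither applicant fabricated anything, yet a similarity threshold set anywhere below the score above would flag both files. That is the structural gap between "these texts overlap" and "these claims are coordinated."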
The evidentiary problem is structural. A fraud flag based on textual similarity does not identify a specific false statement. It identifies a pattern that the system has been designed to treat as suspicious. If that pattern enters the adjudicative record without the applicant’s knowledge, the applicant cannot explain why the similarity exists or demonstrate that it reflects lawful shared resources rather than fabrication.
IV. The EU AI Act framework — high-risk classification and pre-deployment obligations
EU AI Act, Regulation (EU) 2024/1689, Annex III, Articles 10, 13, 14, and 27
Under the EU AI Act, a system with Pangaea Text’s function would fall within the high-risk classification under Annex III as an AI system used in migration, asylum, and border control management for risk assessment purposes with potential effects on legal status and access to international protection.
Article 10 requires that training, validation, and testing datasets be examined for possible biases and that their representativeness for the intended purpose be assessed. A text-analysis system trained on asylum narratives needs to account for the linguistic, cultural, and procedural factors that produce benign textual similarity — translation effects, community legal resources, persecution type convergence. If those factors are not accounted for in the training and validation data, the system will produce false positives for populations with high rates of shared legal resources or similar persecution experiences.
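One way to operationalize the Article 10 examination is a stratified false-positive audit: compare the rate at which non-fraudulent applications are flagged across groups that proxy for shared legal resources, such as clinic-prepared versus self-prepared filings. The sketch below uses invented data and group labels purely for illustration; a real assessment would need validated ground-truth fraud labels.

```python
# Hypothetical Article 10-style audit: false-positive rates stratified by a
# proxy for shared legal resources. Records and labels are invented.
from collections import defaultdict

# (group, flagged_by_system, actually_fraudulent)
records = [
    ("clinic-prepared", True,  False),
    ("clinic-prepared", True,  False),
    ("clinic-prepared", False, False),
    ("self-prepared",   False, False),
    ("self-prepared",   True,  False),
    ("self-prepared",   False, False),
]

def false_positive_rates(rows):
    """Per group: flagged non-fraud cases / all non-fraud cases."""
    flagged = defaultdict(int)
    total = defaultdict(int)
    for group, was_flagged, is_fraud in rows:
        if not is_fraud:
            total[group] += 1
            if was_flagged:
                flagged[group] += 1
    return {g: flagged[g] / total[g] for g in total}

rates = false_positive_rates(records)
# A large gap between groups is evidence the rule penalizes shared resources
# rather than detecting fabrication.
disparity = max(rates.values()) - min(rates.values())
```

A disparity of this kind, measured on representative validation data, is exactly the sort of bias Article 10 requires the provider to examine before deployment.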
Article 13 requires high-risk systems to be sufficiently transparent that deployers can interpret outputs and use them appropriately. A fraud indicator that arrives without explanation of which textual patterns triggered it and why does not obviously allow the reviewing officer to evaluate its reliability or to distinguish a genuine fraud signal from an artifact of shared legal assistance.
Article 14 requires effective human oversight — meaning that the reviewing officer must be capable of understanding the system’s limitations and must be able to disregard its output when appropriate. An officer who receives a fraud flag with no explanation of its basis and no information about the system’s known false-positive rate is not positioned to exercise effective oversight in the Article 14 sense.
Article 27 requires a fundamental-rights impact assessment for public-sector deployers of high-risk systems. An asylum text-analytics system whose flags can influence the credibility evaluation of protection claims has a direct potential impact on the right to asylum and on the prohibition of refoulement — rights protected at the highest level of international and European law. A genuine pre-deployment assessment of those risks would need to address the false-positive problem, the translation effect, and the absence of disclosure to the applicant before those risks can be considered adequately mitigated.
V. The US legal framework — due process and administrative law
U.S. Constitution, Amendment V
Immigration and Nationality Act, 8 U.S.C. § 1158
Administrative Procedure Act, 5 U.S.C. § 706
In the United States, the legal framework for challenging the use of text analytics in asylum adjudication runs through procedural due process, the INA’s statutory requirements for asylum adjudication, and the APA’s arbitrary-and-capricious standard.
The Fifth Amendment’s due process guarantee applies to asylum applicants in removal proceedings. The Board of Immigration Appeals and the federal circuit courts have recognized that due process requires a meaningful opportunity to be heard and to present evidence in asylum and withholding proceedings. If a fraud indicator generated by automated text screening materially influences the officer’s credibility evaluation without disclosure to the applicant, the applicant cannot meaningfully respond to it. They cannot explain why their narrative resembles others, demonstrate that the similarity reflects shared legal assistance rather than fabrication, or challenge the reliability of the detection methodology.
The INA’s asylum provisions require that credibility determinations be based on the totality of the circumstances and all relevant factors, including the applicant’s demeanor, candor, responsiveness, and the consistency of the applicant’s statements with other evidence. 8 U.S.C. § 1158(b)(1)(B)(iii). An undisclosed automated fraud flag that influences the officer’s credibility posture without entering the record as evidence — and without being subject to challenge — does not fit within that framework. It is not a factor the applicant can address because they do not know it exists.
Under the APA, a denial of asylum in removal proceedings may be reviewed for arbitrary and capricious agency action where the denial is based on reasoning that is not rationally connected to the evidence. If an asylum denial rests in material part on a fraud suspicion that originated in automated text comparison, and that comparison methodology has not been validated for the specific linguistic and cultural context of the application, the reasoning chain from screening output to credibility finding to denial may be vulnerable to APA challenge.
VI. Practical strategy — visibility, record-building, and de-linking
For the practitioner, the most actionable strategy in cases where automated asylum screening may have been a factor is record-building before the merits hearing, not after.
The first step is to establish whether screening occurred. A Privacy Act or FOIA request specifically identifying Pangaea Text, ATLAS, and the FDNS-DS system of records can reveal whether a fraud or risk indicator was generated for the applicant’s file. The Alien File SORN — DHS/USCIS/ICE/CBP-001 — covers the Alien File and related case records, and FDNS-DS records are subject to law enforcement exemptions under 5 U.S.C. § 552a(j)(2) and (k)(2). Even an exempted request is not wasted: it creates a record of the inquiry, and a response identifying the applicable exemption can itself confirm that responsive records exist.
The second step is to preserve the issue on the record at the earliest opportunity. In an asylum interview or immigration court proceeding, a lawyer who asks whether any automated fraud screening was applied to the applicant’s narrative and whether any flag or indicator was generated is creating a record that may become significant on appeal. If the officer or the government declines to answer, that declination is itself a preserved fact.
The third step, where a fraud suspicion has entered the case, is to explain the innocent basis for any textual similarity. That is not an abstract argument about algorithmic unreliability. It is concrete evidence: a declaration from the legal clinic that prepared the application, a translator’s affidavit explaining vocabulary choices, documentation of the applicant’s specific and individualized factual circumstances that distinguish their account from the general pattern, and — where available — expert testimony on trauma narrative patterns or translation effects in the relevant language and cultural context.
This is the legaltech version of rectification in asylum cases. The goal is not to prove the algorithm wrong as an engineering matter. It is to prove that the algorithm’s output, whatever it was, does not accurately characterize this applicant’s submission — and that the factual record, correctly understood, does not support the suspicion the system generated.
Next: Chapter 21 — AI in criminal pretrial: bail algorithms, risk scores, and the automation of liberty.