Discharge Status Codes: A Guide to Mapping & ETL

Your nightly ETL job fails on a two-character value in patient_disposition. The source file says 63. Your lookup sheet is outdated, the analyst wants a discharge destination rollup by morning, and billing is asking whether the same field affects reimbursement. That's a familiar moment in healthcare data engineering.
Discharge status codes look small, but they sit at the intersection of claims logic, patient flow, and analytic truth. If you treat them as a loose text attribute, your downstream data drifts fast. If you handle them as a controlled source vocabulary with explicit mapping, validation, and versioning, they become manageable.
Understanding Discharge Status Codes
A discharge status code is one of the smallest fields on an institutional claim, and one of the easiest to mishandle in ETL. On the UB-04, it sits in Form Locator 17 and records the patient's status at the close of the billed stay or billing cycle.
For data engineering, the important point is not the code list itself. It is the business meaning attached to that code at extract time. The field can represent discharge home, transfer to another facility, death, or a patient who remains in care. Those distinctions affect claim interpretation, downstream cohort logic, and how analysts classify care transitions.
Teams usually run into trouble when they treat patient_disposition as a simple text attribute. Source systems label it differently, store it with or without leading zeros, and sometimes populate it from a local discharge workflow instead of the UB-04 value set. I have seen feeds where 03 meant skilled nursing facility in one table and a home health disposition in another because an interface remapped local codes without preserving the original source vocabulary.
That is why static lookup sheets fail over time. They explain the code, but they do not preserve provenance, version history, or source-specific exceptions. A stronger approach is to store three things separately:
- the raw source value,
- the confirmed source vocabulary,
- the mapped standard concept used in OMOP analytics.
Once those are distinct, code review and audit trails get much easier.
This matters outside analytics too. Discharge status often influences claim editing and operational reporting. If your revenue cycle team is trying to improve clean claim rates, this field cannot be left as an unvalidated passthrough.
Why ETL teams misclassify them
The hard part is context.
A code by itself does not prove the source is using the official UB-04 patient discharge status domain. You need to confirm the field definition, the sending application, and whether the value was captured by registration, case management, abstracting, or a billing rules engine. Each path introduces different failure modes.
Common production issues include:
- missing leading zeros, such as 3 instead of 03
- alphanumeric local variants mixed into the same column
- values that are valid historically but no longer used in current source guidance
- a discharge status that conflicts with encounter end date, discharge disposition text, or the next known facility encounter
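A minimal sketch of how the first two issues can be separated at ingest. The function name, the issue labels, and the `CANONICAL_CODES` set are assumptions for illustration; the set holds only the codes listed in this guide, so a real pipeline would load the full domain from its vocabulary source.

```python
import re
from typing import Optional, Tuple

# Assumption: only the codes listed in this guide, not the full UB-04 domain.
CANONICAL_CODES = {"01", "02", "03", "09", "20", "30",
                   "61", "62", "63", "64", "65", "66", "70"}


def normalize_discharge_code(raw: Optional[str]) -> Tuple[Optional[str], str]:
    """Return (lookup_key, issue). The caller keeps the raw value separately."""
    if raw is None or not raw.strip():
        return None, "missing"
    value = raw.strip()
    # Anything other than one or two ASCII digits is treated as a local
    # variant and is never padded or coerced.
    if not re.fullmatch(r"[0-9]{1,2}", value):
        return value, "local_variant"
    padded = value.zfill(2)
    if padded not in CANONICAL_CODES:
        return padded, "not_in_domain"
    if padded != value:
        return padded, "padded"
    return padded, "ok"


print(normalize_discharge_code("3"))   # a one-digit value is padded and flagged
print(normalize_discharge_code("A3"))  # an alphanumeric value is left untouched
```

Returning the issue label alongside the lookup key lets the pipeline count padding fixes and local variants per feed instead of repairing them silently. Retired codes and conflicts with encounter dates need reference data this sketch does not have.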
What the field represents in analytics
Discharge status describes the endpoint of the current billed stay. It does not prove the next encounter occurred, that the transfer completed, or that your warehouse captured the receiving facility. Analysts often overread this field and treat it as a full care journey marker. It is narrower than that.
A reliable pipeline handles discharge status as a controlled source vocabulary with explicit mapping logic. In practice, that means version-controlled mapping files, validation rules tied to encounter type, and programmatic concept lookup rather than a spreadsheet that drifts over time. If you are standardizing into OMOP, OMOPHub API and SDK workflows fit this pattern well because they let you resolve, store, and test mappings as code instead of relying on a static reference tab.
Quick Reference for UB-04 Discharge Status Codes
A failed claim edit lands in your work queue at 6:30 a.m. The encounter closed overnight, the discharge status is 3, the billing team expected 03, and your ETL already mapped the row into a generic transfer bucket. This is the point where a quick reference stops being a documentation nicety and becomes an operational control.
The table below gives a practical reference for common UB-04 discharge status codes you will see in source feeds and mapping files.
Common UB-04 discharge status codes and meanings
| Code | Official Description |
|---|---|
| 01 | Discharge to home or self care |
| 02 | Transfer to another short-term general hospital |
| 03 | Transfer to skilled nursing facility |
| 09 | Admitted as an inpatient to the same hospital |
| 20 | Expired |
| 30 | Still a patient or expected to return for outpatient services |
| 61 | Discharge or transfer to a hospital-based Medicare approved swing bed |
| 62 | Discharge or transfer to an inpatient rehabilitation facility |
| 63 | Discharge or transfer to a long-term care hospital |
| 64 | Discharge or transfer to a Medicaid-certified nursing facility |
| 65 | Discharge or transfer to a psychiatric hospital or psychiatric distinct part unit |
| 66 | Discharge or transfer to a Critical Access Hospital |
| 70 | Discharge or transfer to another type of health care institution not otherwise defined |
These descriptions align with the commonly used CMS-derived reference list summarized by CGS Medicare.
How to use the table without turning it into a brittle lookup
Use this table as a review artifact, not as the final mapping layer. In production ETL, the source of truth should be a version-controlled mapping set that preserves the raw code, normalizes formatting such as leading zeros, and records the target OMOP or analytic category chosen by your team.
That distinction matters. A human-readable table helps analysts read 62 correctly. It does not tell your pipeline how to handle deprecated values, payer-specific local variants, or source systems that collapse multiple discharge concepts into one internal disposition field.
A practical use pattern looks like this:
- Inbound profiling: compare distinct source values against the expected UB-04 domain and flag padding issues such as 3 versus 03.
- Mapping review: let analysts and implementers verify whether a code belongs to home, acute transfer, post-acute transfer, death, or in-progress billing logic.
- Code-driven ETL: store the mapping in Git, test it, and resolve standard concepts programmatically with tools such as the OMOPHub API and SDKs instead of maintaining an untracked spreadsheet.
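The inbound profiling step can be sketched as a per-source-system count of distinct values. This is an illustration under assumptions: `profile_discharge_values`, the bucket names, and the sample rows are hypothetical, and `EXPECTED_DOMAIN` again holds only the codes from the table above.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Assumption: only the codes listed in this guide's reference table.
EXPECTED_DOMAIN = {"01", "02", "03", "09", "20", "30",
                   "61", "62", "63", "64", "65", "66", "70"}


def profile_discharge_values(
    rows: Iterable[Tuple[str, Optional[str]]],
) -> Dict[str, Dict[str, Counter]]:
    """Count distinct raw values per source system, split into three buckets."""
    report = defaultdict(
        lambda: {"in_domain": Counter(), "padding": Counter(), "unexpected": Counter()}
    )
    for source_system, raw in rows:
        value = (raw or "").strip()
        if value in EXPECTED_DOMAIN:
            bucket = "in_domain"
        elif value.isdigit() and value.zfill(2) in EXPECTED_DOMAIN:
            bucket = "padding"      # e.g. 3 where 03 was expected
        else:
            bucket = "unexpected"   # blanks, local variants, unknown codes
        report[source_system][bucket][value] += 1
    return dict(report)


# Hypothetical sample rows: (source_system, raw discharge status value)
sample = [("adt", "03"), ("adt", "3"), ("adt", "3"),
          ("claims", "63"), ("claims", "H1"), ("claims", None)]
for system, buckets in profile_discharge_values(sample).items():
    print(system, {name: dict(counts) for name, counts in buckets.items()})
```

Keeping the counts per source system matters because a padding problem in one feed should not trigger a global normalization rule for every feed.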
For billing and revenue cycle teams, discharge status quality often starts upstream in registration, case management, and claim preparation. This overview of the UB-04 workflow and ways to improve clean claim rates is useful context when the same coding issue keeps resurfacing across files and claim edits.
Keep the reference table in your repo. Keep the executable mapping logic beside it. That combination scales better than a static list copied from one project to the next.
Detailed Meanings and Common Groupings
The short descriptions are accurate, but they're not enough for in-depth analytics. ETL gets easier when you group discharge status codes by what they mean operationally.
Home and community outcomes
01 is the familiar one. It indicates discharge to home or self care. Analysts often treat it as a low-complexity endpoint, but that can hide a lot of variation because “home” says nothing about services arranged after discharge.
If your reporting needs to distinguish unsupported home discharge from home with services, the discharge status code alone won't do it. You'll need other source elements.
Transfers to acute and post-acute settings
Confusion often begins here.
02 points to transfer to another short-term general hospital. That usually belongs in an acute transfer bucket, not a generic “facility” bucket.
Codes such as 03, 61, 62, 63, 64, and 65 point to a range of post-acute and specialty institutions, and 66 identifies a Critical Access Hospital. They shouldn't be flattened into one generic “institutional discharge” category unless your analysis is unconcerned with destination type.
A practical grouping looks like this:
| Group | Typical codes | Why it matters |
|---|---|---|
| Acute transfer | 02, 66 | Supports transfer chain logic and inter-facility flow analysis |
| Skilled or nursing facility | 03, 64 | Important for post-acute utilization and care setting rollups |
| Rehab and specialty hospital | 61, 62, 63, 65 | Often affects readmission logic, destination cohorts, and care pathway analysis |
| Home or self care | 01 | Common endpoint, but limited detail by itself |
| Administrative or unresolved status | 09, 30 | Usually needs encounter-level interpretation |
| Death | 20 | Requires strong consistency checks against mortality data |
Codes that need extra caution
09 can be mishandled if teams think of every discharge status code as a final destination. It means the patient was admitted as an inpatient to the same hospital. In ETL terms, that often signals a transition in billing or care classification inside the same institution, not an external discharge.
30 is another trap. “Still a patient” or “expected to return” usually means the billing cycle closed, but the episode didn't end in the everyday sense. If you're building longitudinal visit logic, don't let 30 masquerade as a completed discharge.
The most expensive mistakes don't come from obscure codes. They come from oversimplifying common ones into a destination model they were never meant to support.
Grouping strategy that works
For analytics, I recommend a two-layer design.
The first layer preserves the exact source discharge status code and its direct standardized meaning. The second layer derives an analytic discharge category such as home, acute transfer, post-acute institutional transfer, death, or ongoing episode. That gives researchers and BI teams something usable without destroying source fidelity.
What doesn't work is a one-column rollup with labels like “discharged” and “transferred.” Those categories are too broad to support reimbursement review, patient journey tracing, or quality logic.
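A minimal sketch of the two-layer design, using the grouping table in this section as the second layer. The category labels, field names, and function name are illustrative assumptions, not a standard.

```python
from typing import Dict, Optional

# Layer 2 lookup, following the grouping table in this section.
# Assumption: the category labels are illustrative names.
ANALYTIC_CATEGORY: Dict[str, str] = {
    "01": "home_or_self_care",
    "02": "acute_transfer",
    "66": "acute_transfer",
    "03": "skilled_or_nursing_facility",
    "64": "skilled_or_nursing_facility",
    "61": "rehab_or_specialty_hospital",
    "62": "rehab_or_specialty_hospital",
    "63": "rehab_or_specialty_hospital",
    "65": "rehab_or_specialty_hospital",
    "09": "administrative_or_unresolved",
    "30": "administrative_or_unresolved",
    "20": "death",
}


def derive_layers(source_code: str) -> Dict[str, Optional[str]]:
    """Keep the exact source code (layer 1) and add a derived category (layer 2)."""
    return {
        "discharge_status_source_value": source_code,
        # None means "no category assigned yet", never a default bucket.
        "analytic_discharge_category": ANALYTIC_CATEGORY.get(source_code),
    }


print(derive_layers("62"))
print(derive_layers("70"))  # 70 is not in the grouping table, so no category
```

Because the derived category is a separate field, the grouping can change later without touching the stored source value.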
Why Standardizing Codes with OMOP is Critical
A common failure pattern shows up during multi-site ETL. Hospital A sends valid UB-04 discharge status codes. Hospital B sends local labels derived from the same billing field. Hospital C preserves the raw code but omits enough metadata that nobody can tell whether 03 means a transfer, a custom internal category, or a stale mapping from an old interface. If those values go straight into analytics, the same patient outcome gets counted three different ways.

Why source codes alone aren't enough
In OMOP, discharge destination belongs in a vocabulary-backed model. That gives ETL teams one consistent representation of destination semantics across feeds, vendors, and refresh cycles.
The payment impact is real, too. In institutional billing, the discharge status code functions as a claim-processing control. CMS states that the code must be supported by the medical record and match the patient's actual post-discharge setting. Its guidance on why patient discharge status codes matter also notes that incorrect coding can lead to overpayment or underpayment.
Standardization also fixes a problem that static lookup tables usually miss. Lookup tables answer “what does this code mean” on one day. Production ETL needs to answer “what did we map, from which source vocabulary, under which version, using which rule.” That is the difference between a one-time conversion and a mapping system you can maintain. A version-controlled semantic mapping workflow is what keeps discharge status logic reproducible when source feeds drift or vocabulary content changes.
The practical payoff in OMOP
For implementation, I use three layers and persist all three:
- Raw source value: what arrived in the claim or ADT feed
- Source vocabulary concept: the identified meaning in the discharge status vocabulary
- Standard concept: the normalized concept used for analytics and cross-site queries
That structure solves real downstream problems.
- Cross-site comparability: different sender conventions can resolve to the same standard concept
- Auditability: analysts can trace any cohort count back to the submitted value and mapping rule
- Change control: vocabulary updates become reviewable ETL changes instead of hidden spreadsheet edits
The trade-off is maintenance. Teams have to own vocabulary-aware mapping logic, test it, and version it with the rest of the pipeline. That work is still cheaper than reconciling broken transfer rates, post-acute utilization counts, or mortality analyses after inconsistent discharge mappings have already reached production.
Mapping Strategies to OMOP and SNOMED CT
A reliable mapping pipeline doesn't jump straight from 03 to an analytic label. It moves through a vocabulary model. That model is what keeps your ETL explainable.

The mapping path
The usual path is:
- identify the raw source value,
- confirm the source vocabulary,
- find the corresponding source concept,
- follow concept relationships to the standard concept,
- persist both for traceability.
That means your ETL should never store only the final standard concept. Keep the original source code and source concept metadata with it.
Here's the mental model:
| Stage | What you store | Why it matters |
|---|---|---|
| Raw extract | Original code and source field name | Preserves provenance |
| Source vocabulary mapping | Source concept ID and vocabulary | Confirms semantic identity |
| Standardization | Standard concept ID | Supports consistent analytics |
| Audit metadata | Vocabulary version and mapping rule | Supports reproducibility |
Why the source vocabulary matters
If the feed really uses the UB-04 patient discharge status vocabulary, the source code can be resolved through that vocabulary before you move to the standard concept. If it doesn't, the first task is not “find the standard concept.” The first task is “decode the local meaning.”
That distinction prevents a common failure mode: mapping local labels that resemble UB-04 values but don't carry UB-04 semantics.
For a broader view of how this source-to-standard path works across clinical domains, OMOPHub's article on semantic mapping in healthcare data pipelines is a useful companion.
What good mapping logic looks like
Good logic is explicit about uncertainty. It distinguishes:
- Exact vocabulary match: canonical UB-04 code found and mapped
- Local alias resolved: local code translated to UB-04 meaning first
- Ambiguous source: insufficient evidence to map confidently
- No match: retained as unmapped and routed for review
Bad logic collapses all four into one branch and automatically assigns a destination concept anyway.
That silent assignment is what breaks trust in longitudinal analytics.
Programmatic Concept Lookups with OMOPHub
At some point, every team hits the same failure mode. A discharge status code that looked settled in a spreadsheet turns out to map differently across source systems, and now the ETL has to explain why last month's counts changed. Programmatic lookups fix that by putting the mapping logic in code, under version control, with outputs you can test and review.

Start with interactive exploration
The Concept Lookup tool is a good first pass when you need to inspect a code, check the source vocabulary, or confirm that a candidate concept exists before you wire the lookup into ETL.
For the broader workflow, OMOPHub's guide to OMOP concept mapping workflows shows how source concepts, relationships, and standard targets fit together.
Python example for lookup-driven ETL
In production, the goal is not just to find a concept. The goal is to make the same decision every time, record how that decision was made, and fail safely when the source value is unclear.
The Python and R SDK repositories are the practical starting point for that pattern. If your team uses generated code snippets or LLM-assisted development, verify endpoint names and method signatures against the current full LLM-friendly docs export before you ship changes.
Here's a practical Python sketch:
```python
from omophub import OMOPHub

client = OMOPHub(api_key="YOUR_API_KEY")


def resolve_discharge_status(source_code: str):
    # Normalize before search: trim whitespace and left-pad to two characters.
    code = source_code.strip().zfill(2)

    # Scope the search to the expected source vocabulary.
    # Verify method names and vocabulary IDs against the current SDK docs.
    source_candidates = client.concepts.search(
        query=code,
        vocabulary=["UB04 Patient Discharge Status", "UB04 Pt dis status"]
    )
    if not source_candidates:
        return {
            "source_code": code,
            "source_concept_id": 0,
            "standard_concept_id": 0,
            "status": "not_found"
        }

    # Takes the first hit, which is only reasonable because the search
    # is vocabulary-scoped; confirm the hit matches the code in production.
    source_concept = source_candidates[0]
    relationships = client.concepts.relationships(
        concept_id=source_concept["concept_id"]
    )

    # Keep only relationships that point at a standard concept.
    standard_targets = [
        rel for rel in relationships
        if rel.get("standard_concept_id")
    ]
    if not standard_targets:
        return {
            "source_code": code,
            "source_concept_id": source_concept["concept_id"],
            "standard_concept_id": 0,
            "status": "no_standard_target"
        }

    standard = standard_targets[0]
    return {
        "source_code": code,
        "source_concept_id": source_concept["concept_id"],
        "standard_concept_id": standard["standard_concept_id"],
        "status": "mapped"
    }
```
What matters in this pattern is the behavior around the lookup, not just the API call.
- Normalize before search: trim whitespace, left-pad codes, and standardize case if your feed is inconsistent.
- Constrain the vocabulary: discharge status values are easier to resolve correctly when the search is scoped to the expected source vocabulary.
- Persist both layers: keep the source concept ID and the standard concept ID so analysts can audit the translation path later.
- Return an explicit mapping state: not_found, no_standard_target, and mapped support better downstream handling than a single populated concept field.
- Treat the lookup logic as code: review changes through pull requests, pin SDK versions, and rerun tests when vocabulary content changes.
I would also avoid a common shortcut here. Do not search globally, grab the top hit, and assume it is safe because the code value looks familiar. Discharge status mapping breaks in quiet ways, especially when local discharge fields reuse short numeric codes that resemble UB-04 values but mean something else.
A coded lookup layer gives you repeatable behavior and a clear audit trail. It also makes versioned remapping possible when vocabularies change or source systems are corrected. That is the key advantage over static lookup tables. You are not just storing mappings. You are building a mapping service that can be tested, reviewed, and rerun at scale.
Building a Robust ETL Process for Discharge Status
Friday night is a common failure point for discharge status ETL. A source feed changes from 06 to 6, one facility starts sending local values that look like UB-04 codes, and by Monday the dashboard shows a spike in home discharges that never happened. The fix is rarely a bigger lookup table. It is a production-ready mapping workflow with explicit rules, version control, and repeatable outputs.
Build a reusable mapping service
Put discharge status mapping behind a shared service in your pipeline, not inside scattered SQL case statements or notebook cells. The service should take the raw source value plus source-system context, normalize the value, determine whether the code is canonical or local, attempt the mapping, and return a structured record that downstream jobs can trust.
That record should include at least:
- Original input: the untouched source value
- Normalized value: the transformed lookup key
- Source vocabulary decision: canonical UB-04, local mapped-to-UB-04, or unknown
- Mapping output: source concept ID, standard concept ID, and status
- Audit metadata: vocabulary version, processing timestamp, and rule identifier
This design pays off during incident review. Instead of asking why a concept ID looks wrong, you can see whether the problem came from normalization, vocabulary selection, or a missing local crosswalk.
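The record described above might be modeled as an immutable dataclass. This is a sketch under assumptions: the class and field names are hypothetical, and the concept IDs, vocabulary version, and rule identifier in the usage line are placeholders rather than real values.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class DischargeMappingRecord:
    original_input: Optional[str]       # untouched source value
    normalized_value: Optional[str]     # transformed lookup key
    source_vocabulary_decision: str     # canonical_ub04, local_mapped_to_ub04, unknown
    source_concept_id: int              # 0 when no source concept was found
    standard_concept_id: int            # 0 when unmapped
    status: str                         # e.g. mapped, not_found, no_standard_target
    vocabulary_version: str             # audit metadata
    rule_id: str                        # audit metadata
    processed_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


# Placeholder values only; nothing here is a real concept ID or version.
record = DischargeMappingRecord(
    original_input="3",
    normalized_value="03",
    source_vocabulary_decision="canonical_ub04",
    source_concept_id=0,
    standard_concept_id=0,
    status="not_found",
    vocabulary_version="vocab-version-placeholder",
    rule_id="pad-then-lookup",
)
print(asdict(record))
```

Freezing the record keeps downstream jobs from mutating a mapping decision after the fact; a correction becomes a new record with a new timestamp and rule identifier.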
Handle non-matches deliberately
Forced mappings create bad analytics. An unknown discharge code mapped to a familiar destination is usually worse than leaving it unmapped and routing it for review.
Use a decision path that reflects how real feeds fail:
- If the code is a confirmed canonical value, map it directly.
- If it is local but documented, translate it to the intended UB-04 meaning first.
- If it is ambiguous, set the standard concept to 0, flag it, and queue it for review.
- If the source field is null or not applicable, represent that state explicitly.
Null is not home. It is missing information, and the ETL should preserve that distinction.
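The decision path above can be sketched as a small routing function. The function name, route labels, and the `source_is_ub04` flag are assumptions for illustration, and the crosswalk entry in the usage lines is a made-up local code.

```python
from typing import Dict, Optional, Set


def route_discharge_value(
    raw: Optional[str],
    source_is_ub04: bool,
    canonical_codes: Set[str],
    local_crosswalk: Dict[str, str],
) -> Dict[str, Optional[str]]:
    """Decide how a raw value should be handled before any concept lookup."""
    if raw is None or not raw.strip():
        # Null is its own state, not a default destination.
        return {"ub04_code": None, "route": "null_or_not_applicable"}
    value = raw.strip()
    if source_is_ub04 and value in canonical_codes:
        return {"ub04_code": value, "route": "canonical"}
    if value in local_crosswalk:
        # Documented local code: translate to its UB-04 meaning first.
        return {"ub04_code": local_crosswalk[value], "route": "local_documented"}
    # Everything else is ambiguous: the caller sets the standard concept
    # to 0, flags the row, and queues it for review.
    return {"ub04_code": None, "route": "needs_review"}


canonical = {"01", "03", "63"}
crosswalk = {"SNF": "03"}  # hypothetical local code
print(route_discharge_value("03", True, canonical, crosswalk))
print(route_discharge_value("03", False, canonical, crosswalk))
```

The second call shows the design choice: a value that looks canonical still goes to review when the source has not been confirmed as a UB-04 feed.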
Versioning and regression checks
Discharge mapping logic should ship like application code. Store the normalization rules, local crosswalks, and expected outputs in version control. Review changes through pull requests. Rerun tests when source feeds change or when you refresh vocabularies.
A small regression pack catches most breakage:
| Test type | What it checks |
|---|---|
| Canonical code fixtures | Known source values still resolve as expected |
| Unknown-value tests | Invalid values stay unmapped |
| Local alias tests | Custom crosswalk entries still behave correctly |
| Release diff review | Mapping outputs before and after vocabulary update |
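A regression pack like this can start as two small helpers: one that replays fixtures through the mapper and one that diffs mapping outputs across a vocabulary refresh. The helper names, the toy mapper, and the fixture values are assumptions for illustration; in a real pipeline the mapper would be your production mapping function.

```python
from typing import Callable, Dict, List, Optional, Tuple


def run_fixtures(
    mapper: Callable[[str], Optional[str]],
    fixtures: Dict[str, Optional[str]],
) -> List[Tuple[str, Optional[str], Optional[str]]]:
    """Return (input, expected, actual) for every fixture that no longer matches."""
    failures = []
    for raw, expected in fixtures.items():
        actual = mapper(raw)
        if actual != expected:
            failures.append((raw, expected, actual))
    return failures


def diff_mapping_outputs(
    before: Dict[str, int], after: Dict[str, int]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """List codes whose mapped concept changed between two vocabulary releases."""
    keys = before.keys() | after.keys()
    return {
        k: (before.get(k), after.get(k))
        for k in sorted(keys)
        if before.get(k) != after.get(k)
    }


def toy_mapper(raw: str) -> Optional[str]:
    # Stand-in for the real mapper: pads, then accepts two known codes.
    code = raw.strip().zfill(2)
    return code if code in {"01", "03"} else None


# Canonical, padding, and unknown-value fixtures in one dictionary.
FIXTURES = {"3": "03", "01": "01", "ZZ": None}
print(run_fixtures(toy_mapper, FIXTURES))
```

An empty failure list is the pass condition, and a non-empty diff after a vocabulary refresh is the artifact to attach to the pull request for review.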
I also recommend keeping data quality checks close to the mapper, not in a separate governance backlog. A practical pattern is to pair the mapping service with automated ETL data quality checks for healthcare pipelines so rejected codes, null spikes, and destination shifts are visible in the same release cycle.
As noted earlier, OMOPHub gives you the API and SDK pattern for coded concept resolution. The part that determines whether your pipeline holds up in production is everything around that lookup: source-aware routing, explicit unmapped states, regression coverage, and a full audit trail for every mapped discharge value.
Validation Rules and Handling Edge Cases
Even a clean mapping layer won't rescue bad source coding. Validation has to sit next to mapping, not after it.
A Medicare analysis of hip and knee replacement surgeries found discharge codes were inaccurate in about 9% of discharges, with accuracy varying by destination. Home discharges were correct 82.5% of the time, while long-term care hospital discharges were only 41.1% accurate, according to the published claims analysis in PMC. That gap is exactly why destination-specific validation matters.

Validation rules worth implementing
Use simple rules first. They catch a surprising amount of bad data.
- Date logic: discharge date can't precede admission date.
- Death consistency: if the status is expired, it should align with mortality data captured elsewhere in your model.
- Transfer plausibility: transfer destinations should be consistent with available follow-on encounter data when your sources support that check.
- Encounter state checks: codes indicating same-hospital admission or still-a-patient status shouldn't be treated like ordinary completed discharges.
- Allowed value enforcement: source values should either match the canonical domain, a documented local crosswalk, or be flagged.
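Several of these rules can be expressed as one function that returns the names of the rules that fired. This is a sketch under assumptions: the `Encounter` shape, the rule names, and the allowed-code set are hypothetical, and transfer plausibility is left out because it depends on follow-on encounter data.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Set

# Assumption: only the codes listed in this guide's reference table.
ALLOWED = {"01", "02", "03", "09", "20", "30",
           "61", "62", "63", "64", "65", "66", "70"}


@dataclass
class Encounter:
    admit_date: date
    discharge_date: Optional[date]
    discharge_status: Optional[str]
    death_date: Optional[date] = None  # from mortality data elsewhere in the model


def validate_discharge(enc: Encounter, allowed_codes: Set[str]) -> List[str]:
    """Return the names of the validation rules that fired for one encounter."""
    fired = []
    # Date logic: discharge cannot precede admission.
    if enc.discharge_date is not None and enc.discharge_date < enc.admit_date:
        fired.append("discharge_before_admission")
    # Allowed value enforcement, with missing kept as its own state.
    if enc.discharge_status is None:
        fired.append("missing_status")
    elif enc.discharge_status not in allowed_codes:
        fired.append("value_not_allowed")
    # Death consistency: expired status should have a matching death record.
    if enc.discharge_status == "20" and enc.death_date is None:
        fired.append("expired_without_death_record")
    # Encounter state: 09 and 30 are not ordinary completed discharges.
    if enc.discharge_status in {"09", "30"}:
        fired.append("not_a_completed_discharge")
    return fired


# Illustrative dates only.
print(validate_discharge(Encounter(date(2024, 1, 1), None, "30"), ALLOWED))
```

Returning rule names instead of a single pass or fail flag gives the audit trail something concrete to store next to the mapped value.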
Edge cases that deserve their own branch
The problem cases aren't always errors. Some are workflow artifacts.
A few examples:
| Edge case | Recommended handling |
|---|---|
| Blank source field | Preserve null state and flag for source completeness review |
| 30 with closed encounter | Check whether the billing cycle closed while the care episode continued |
| Transfer code with no corroborating evidence | Keep mapped value if source is authoritative, but mark lower confidence for analytics |
| Local legacy code | Route through a maintained crosswalk, not ad hoc analyst interpretation |
If you're building a broader quality framework around these checks, OMOPHub's guide to data quality checking in OMOP pipelines is a useful starting point.
Audit trails matter most when the source data is plausible but suspicious. You want to show what the claim said, what the mapper did, and which validation rules fired.
What doesn't work
Two habits create long-term pain:
- Silent correction: changing source values without preserving the original
- One-pass validation: running checks only at initial ingest and never again after vocabulary or business-rule updates
Validation should be repeatable and rerunnable. If you can't rerun it, you can't trust historical consistency.
Conclusion: Your Path to Reliable Data
Discharge status codes are small fields with outsized impact. They shape claim interpretation, transfer logic, post-acute analysis, and destination cohorts. They also fail in very ordinary ways: unclear source definitions, local variants, overbroad rollups, and missing validation.
The fix isn't another static spreadsheet. It's a disciplined ETL pattern. Confirm the source vocabulary. Preserve the raw value. Map through source concepts to standard concepts. Store audit metadata. Validate the result against encounter logic and supporting evidence.
That approach gives you something most healthcare data teams need more of: reproducibility. When a researcher asks why a visit landed in a post-acute bucket, or when compliance asks how a claim-derived destination was standardized, you should be able to answer from code and metadata, not memory.
If you're cleaning up a legacy pipeline, start with a profile of distinct discharge status values by source system. Separate canonical UB-04 codes from local variants. Then move the mapping logic into a tested function or service. Once that foundation is in place, the rest of the analytics stack gets easier to trust.
If you want to implement a version-controlled, API-driven approach instead of maintaining local vocabulary infrastructure, OMOPHub provides programmatic access to OMOP standardized vocabularies so teams can search concepts, traverse relationships, and build discharge-status mapping into ETL workflows with auditability.


