Change string to number: conversion patterns that survive messy data
Every data pipeline eventually hits the same wall: a number arrives as a string. A price comes in as "$49.99", an API returns "1,200" instead of 1200, or a CSV import turns every integer into text. Changing a string to a number sounds trivial until you face real-world data with currency symbols, commas, whitespace, and locale-specific decimal separators.
According to the 2024 Kaggle State of Data Science survey, data cleaning consumes 45% of a data scientist's time, and type conversion errors rank among the top three causes of pipeline failures. A Google Cloud research paper (2024) found that 26% of data quality issues in production ML pipelines trace back to type coercion problems.
On CodeWords, data transformation runs inside Python microservices where you have full control over conversion logic — no low-code limitations, no silent type coercion surprises.
Unlike generic AI automation posts, this guide shows real CodeWords workflows — not just theory. You will learn conversion patterns across languages, handle edge cases, and build pipelines that do not break on messy inputs.
TL;DR
- Use int() and float() in Python, Number() or parseFloat() in JavaScript, but always validate and clean the string first.
- Real-world strings contain currency symbols, commas, whitespace, and mixed formats. Stripping non-numeric characters before conversion prevents silent failures.
- CodeWords workflows handle type conversion as part of data pipelines, with validation and error routing built into the processing layer.
How do you change string to number in Python?
Python offers explicit conversion functions. No magic, no silent coercion.
Basic conversions:
integer_value = int("42") # 42
float_value = float("3.14") # 3.14
negative = int("-100") # -100
scientific = float("1.5e3") # 1500.0
What fails:
int("42.5") # ValueError — int() rejects decimal strings
int("1,200") # ValueError — commas are not valid
float("$49.99") # ValueError — currency symbols break parsing
int("") # ValueError — empty strings
int(None) # TypeError — None is not a string
Safe conversion pattern:
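import re

def safe_to_number(value):
    """Strip non-numeric characters and convert; return None if nothing usable remains."""
    if value is None:
        return None
    # Keep only digits, dots, and minus signs.
    cleaned = re.sub(r"[^\d.\-]", "", str(value))
    if not cleaned or cleaned == "-":
        return None
    try:
        return float(cleaned) if "." in cleaned else int(cleaned)
    except ValueError:
        # Leftovers such as "1.2.3" or "4-2" are not valid numbers.
        return None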
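This handles "$1,200.50" → 1200.5, "(42)" → 42, and "" → None. Note that the parentheses are simply dropped, so an accounting-style negative such as "(42)" comes out positive, and European formats such as "1.234,56" need their separators normalized before they reach this function. In CodeWords workflows, this pattern lives inside your FastAPI microservice, applied to every record in a batch before downstream processing.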
Python's decimal.Decimal is worth mentioning for financial calculations where floating-point precision matters. Decimal("19.99") avoids the 0.1 + 0.2 != 0.3 problem that trips up float().
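To make the precision point concrete, here is a minimal sketch using only the standard library. The helper name to_decimal is an illustration, not part of any CodeWords API, and it assumes the input string has already been cleaned of symbols and separators.

```python
from decimal import Decimal, InvalidOperation

def to_decimal(value):
    """Parse a clean numeric string into an exact Decimal, or return None."""
    try:
        number = Decimal(str(value).strip())
    except InvalidOperation:
        return None
    # Decimal accepts "NaN" and "Infinity"; treat those as invalid amounts.
    return number if number.is_finite() else None

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
print(0.1 + 0.2 == 0.3)  # False
# Decimal keeps the digits exactly as written in the string.
print(to_decimal("0.1") + to_decimal("0.2") == Decimal("0.3"))  # True
print(to_decimal("19.99") * 3)  # 59.97
```

Build the Decimal from the string, not from a float: Decimal(0.1) would inherit the float's rounding error.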
How do you change string to number in JavaScript?
JavaScript offers multiple approaches, each with different behavior on edge cases:
Number() — strictest built-in:
Number("42") // 42
Number("3.14") // 3.14
Number("") // 0 — surprise! Empty string becomes 0
Number("hello") // NaN
Number("42px") // NaN — rejects trailing non-numeric chars
parseInt() and parseFloat() — more lenient:
parseInt("42px") // 42 — stops at first non-numeric char
parseFloat("3.14em") // 3.14 — same behavior
parseInt("0x1A") // 26 — parses hex
parseInt("08", 10) // 8 — always specify radix 10!
Unary + operator — shortest but least readable:
+"42" // 42
+"" // 0 — same empty string gotcha as Number()
+null // 0 — null becomes 0
+undefined // NaN
The Number("") returning 0 instead of NaN has caused countless bugs. Always validate before converting. The MDN Web Docs detail every edge case.
On CodeWords, JavaScript runs in generated Next.js UIs while Python handles the data pipeline layer. If your workflow generates a front-end that accepts user input, validate and convert on both sides — JavaScript for immediate feedback, Python for authoritative processing.
What edge cases break string-to-number conversion?
Real data is adversarial. These are the cases that break naive conversion:
- Locale-specific decimals: In Germany, "1.234,56" means one thousand two hundred thirty-four point fifty-six. In the US, "1,234.56" means the same thing. Without locale awareness, you will parse the wrong value.
- Currency symbols: "€49,99", "$1,200", "¥10000". Strip symbols first, but know which decimal convention the symbol implies.
- Percentage strings: "45%". Do you want 45 or 0.45? Define the contract before converting.
- Whitespace: " 42 ", "\t100\n". strip() in Python and trim() in JavaScript handle this, but only if you remember to call them.
- Boolean-like strings: "true", "false", "yes", "no". Some systems encode these as strings that downstream logic expects as 1/0.
- Scientific notation: "1e10", "2.5E-3". float() handles this correctly, but int() does not.
- Numeric strings with leading zeros: "007". int("007") gives 7 in Python, but parseInt("007") was historically octal in some JS engines (always pass radix 10).
A CodeWords workflow can handle these systematically. Build a data cleaning step that runs before any numeric operation — detect the format, clean it, convert it, validate the result.
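A cleaning step along those lines might look like the sketch below. It assumes the caller already knows the source convention and passes the separators in; the function name parse_localized and the choice to turn "45%" into 0.45 are illustrative decisions, not a fixed contract.

```python
import re

def parse_localized(text, decimal_sep=".", thousands_sep=","):
    """Convert a formatted numeric string to float using an explicit convention.

    The separators are supplied by the caller (an assumption of this sketch);
    nothing here guesses the locale from the data.
    """
    if text is None:
        return None
    cleaned = text.strip()
    is_percent = cleaned.endswith("%")
    # Remove the thousands separator, then normalize the decimal separator.
    cleaned = cleaned.replace(thousands_sep, "").replace(decimal_sep, ".")
    # Drop everything else that is not a digit, a dot, or a minus sign.
    cleaned = re.sub(r"[^\d.\-]", "", cleaned)
    try:
        number = float(cleaned)
    except ValueError:
        return None
    # Contract chosen for this sketch: "45%" becomes 0.45.
    return number / 100 if is_percent else number

print(parse_localized("1,234.56"))                                      # 1234.56
print(parse_localized("1.234,56", decimal_sep=",", thousands_sep="."))  # 1234.56
print(parse_localized("€49,99", decimal_sep=",", thousands_sep="."))    # 49.99
print(parse_localized("45%"))                                           # 0.45
```

Passing the convention explicitly keeps the function predictable: the same input always yields the same output, whatever the locale settings of the machine running the pipeline.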
How do you handle conversion errors in data pipelines?
Silent failures are worse than crashes. A failed conversion that returns 0 instead of raising an error will corrupt your data without any alert.
Pipeline error handling strategies:
1. Fail loudly (strict):
value = int(raw_string) # raises ValueError on bad input
Use this when every record must be valid. Failed conversion stops the pipeline and alerts the operator.
2. Default value (lenient):
try:
    value = int(raw_string)
except (ValueError, TypeError):
    value = 0  # or None, or a sentinel value
Use this when some bad records are expected and you want the pipeline to continue. Log the default assignments for review.
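One way to keep those default assignments reviewable is to log each one as it happens. This is a small sketch with the standard logging module; the logger name "conversion" and the helper int_or_default are hypothetical names chosen for the example.

```python
import logging

logger = logging.getLogger("conversion")

def int_or_default(raw, default=None, field="value"):
    """Convert raw to int; on failure, log the bad input and return default."""
    try:
        return int(raw)
    except (ValueError, TypeError):
        # Record what was replaced so the defaults can be audited later.
        logger.warning("Bad %s %r replaced with default %r", field, raw, default)
        return default
```

Defaulting to None rather than 0 keeps a failed conversion distinguishable from a genuine zero further down the pipeline.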
3. Error routing (production):
successes, failures = [], []
for record in records:
    try:
        record["amount"] = float(record["amount_str"])
        successes.append(record)
    except (ValueError, TypeError):
        failures.append(record)
Route failures to a separate queue for human review. On CodeWords, push failures to an Airtable error log or a Slack channel while successes continue through the pipeline.
4. AI-assisted parsing:
For truly messy data — free-text fields, OCR output, inconsistent formats — pass the string to an LLM with instructions to extract the numeric value. CodeWords provides native access to OpenAI, Anthropic, and Google Gemini for this kind of fuzzy extraction. Reserve this for cases where rule-based parsing fails.
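The "rules first, model second" ordering can be sketched without committing to any particular provider. In the example below, fallback is a hypothetical callable that you would write yourself (for instance, a wrapper around an LLM request); it is not a CodeWords or vendor API. The model's answer is checked against the same pattern as rule-based input, so an unparseable reply is rejected instead of trusted.

```python
import re

# Matches a single plain number such as "1200", "-3.5", or "1,200.50".
_PLAIN_NUMBER = re.compile(r"^-?\d[\d,]*(\.\d+)?$")

def extract_number(text, fallback=None):
    """Try cheap rule-based parsing first; call fallback only when it fails.

    fallback is a hypothetical callable that takes the raw text and
    returns a numeric string or None.
    """
    candidate = text.strip().lstrip("$€£¥")
    if not _PLAIN_NUMBER.match(candidate) and fallback is not None:
        candidate = (fallback(text) or "").strip()
    if not _PLAIN_NUMBER.match(candidate):
        return None  # neither path produced a verifiable number
    return float(candidate.replace(",", ""))
```

Because the fallback only runs when the pattern does not match, clean records never incur a model call.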
How do you build type-safe data pipelines on CodeWords?
A complete data pipeline on CodeWords that handles string-to-number conversion:
1. Ingest: Pull data from Google Sheets, an API, or a CSV upload via the integrations layer.
2. Schema validation: Define expected types for each field. Use Pydantic models in your FastAPI microservice for automatic validation.
3. Clean and convert: Apply cleaning functions to each field: strip whitespace, remove currency symbols, normalize decimals, then convert.
4. Validate ranges: A converted number that is technically valid but outside expected bounds (negative prices, ages over 200) should be flagged.
5. Route errors: Send invalid records to a review queue. Send valid records downstream.
6. Output: Write cleaned, typed data to your target: a database, a spreadsheet, another API, or an AI workflow that needs numeric input.
Pydantic makes steps 2 and 3 particularly clean:
import re

from pydantic import BaseModel, field_validator

class Transaction(BaseModel):
    amount: float
    quantity: int

    @field_validator("amount", mode="before")
    @classmethod
    def parse_amount(cls, v):
        # Strip currency symbols and separators before Pydantic coerces to float.
        if isinstance(v, str):
            return float(re.sub(r"[^\d.\-]", "", v))
        return v
This runs in every CodeWords ephemeral sandbox, validating and converting data at the pipeline boundary rather than hoping downstream code handles the mess.
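Range validation and error routing (the "validate ranges" and "route errors" stages in the list above) can be sketched in plain Python. The function names and the bounds used here are assumptions for illustration; real limits depend on your data.

```python
def validate_range(record, field, minimum, maximum):
    """Return an error message if record[field] is missing or out of bounds, else None."""
    value = record.get(field)
    if value is None:
        return f"{field} is missing"
    # NaN fails this comparison too, so it is flagged rather than passed through.
    if not (minimum <= value <= maximum):
        return f"{field}={value} outside [{minimum}, {maximum}]"
    return None

def split_by_range(records, field, minimum, maximum):
    """Route records into (valid, flagged) lists based on a bounds check."""
    valid, flagged = [], []
    for record in records:
        error = validate_range(record, field, minimum, maximum)
        if error is None:
            valid.append(record)
        else:
            # Attach the reason so a reviewer sees why the record was held back.
            flagged.append({**record, "error": error})
    return valid, flagged
```

For example, split_by_range(rows, "amount", 0, 1_000_000) would hold back negative prices and missing amounts while letting the rest continue downstream.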
FAQs
What is the safest way to change string to number in Python?
Use a try/except block around int() or float(). Clean the string first by removing non-numeric characters. For financial data, use decimal.Decimal() to avoid floating-point precision issues. Always handle None, empty strings, and whitespace explicitly.
Why does JavaScript Number("") return 0 instead of NaN?
It is defined in the ECMAScript specification: empty strings and whitespace-only strings convert to 0. This is a known footgun, and isNaN("") returns false, so a NaN check alone will not catch it. Reject empty or whitespace-only strings explicitly before converting, then check the result with Number.isNaN().
How do I handle numbers with commas in different locales?
Detect the locale first. If you know the source locale, strip the thousands separator and normalize the decimal separator before converting. Python's locale.atof() parses according to the active locale, and JavaScript's Intl.NumberFormat only formats numbers, although its formatToParts() method can reveal a locale's separators. Explicit string replacement is often more predictable in pipelines.
Can AI help with ambiguous number parsing?
Yes. For OCR output, free-text fields, or inconsistent formats, an LLM can extract numeric values with context. On CodeWords, pass ambiguous strings to a model with instructions like "extract the dollar amount from this text" for fuzzy extraction that rule-based parsing cannot handle.
Types are trust boundaries
The moment you change string to number, you are making an assertion: "This data is numeric and I trust it." That assertion needs to be validated, not assumed. The best data pipelines treat type conversion as a trust boundary — the point where raw input becomes typed, validated, usable data.
Build your conversion logic into the pipeline, not as an afterthought. CodeWords gives you the Python execution environment and integration layer to do this cleanly. Check the templates library for data pipeline patterns that include validation and type conversion.





