@ajpotts ajpotts commented Jan 6, 2026

Enable pandas arithmetic dispatch for Arkouda ExtensionArray

Summary

This PR implements the pandas ExtensionArray arithmetic hook (_arith_method) for
ArkoudaExtensionArray, enabling elementwise arithmetic operations (e.g. +, -, *)
between two Arkouda-backed arrays and between an Arkouda-backed array and a scalar.


Motivation

The pandas ExtensionArray base class does not define Python arithmetic operators
(__add__, etc.) on its own. Instead, arithmetic is routed through _arith_method.
Without this hook, expressions like:

pd.array([1, 2, 3], dtype="ak_int64") + pd.array([4, 5, 6], dtype="ak_int64")

raise TypeError.

Implementing _arith_method is required for:

  • correct pandas operator dispatch
  • future Series / DataFrame arithmetic
  • consistency with pandas ExtensionArray contracts

What’s in this PR

Core functionality

  • Adds _arith_method to ArkoudaExtensionArray
    • Supports EA–EA and EA–scalar operations
    • Returns NotImplemented for unsupported operand types
    • Preserves the concrete EA type on return
  • Adds _from_data constructor helper for mypy-safe instance creation
  • Annotates internal _data attribute for static typing
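
The dispatch pattern described in the list above can be sketched roughly as follows. This is not the code from this PR: ToyExtensionArray is a hypothetical stand-in that stores a plain Python list in _data instead of an Arkouda array, and the operator dunders are wired up by hand so the example runs without pandas or Arkouda installed.

```python
import operator
from numbers import Number


class ToyExtensionArray:
    """Hypothetical stand-in for ArkoudaExtensionArray (list-backed)."""

    def __init__(self, values):
        self._data = list(values)

    @classmethod
    def _from_data(cls, data):
        # Single construction point, so results keep the concrete subclass.
        obj = cls.__new__(cls)
        obj._data = data
        return obj

    def _arith_method(self, other, op):
        if hasattr(other, "_data"):
            # EA-EA: unwrap the other operand by duck typing.
            if len(other._data) != len(self._data):
                raise ValueError("Lengths must match")
            result = [op(a, b) for a, b in zip(self._data, other._data)]
        elif isinstance(other, Number):
            # EA-scalar: apply the scalar to every element.
            result = [op(a, other) for a in self._data]
        else:
            # Unsupported operand: let Python try the reflected operation.
            return NotImplemented
        return type(self)._from_data(result)

    # Hand-written operator wiring, only for this standalone sketch.
    def __add__(self, other):
        return self._arith_method(other, operator.add)

    def __sub__(self, other):
        return self._arith_method(other, operator.sub)

    def __mul__(self, other):
        return self._arith_method(other, operator.mul)


x = ToyExtensionArray([1, 2, 3])
y = ToyExtensionArray([10, 20, 30])
print((x + y)._data)  # [11, 22, 33]
print((x * 2)._data)  # [2, 4, 6]
```

Returning NotImplemented (rather than raising) leaves the final decision to Python, which raises TypeError only if the other operand cannot handle the operation either.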

Typing & correctness

  • Uses typing_extensions.Self for precise self-type returns
  • Uses NotImplementedType (not the value) in return annotations
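
As an illustration of the two typing points above, here is a minimal sketch. The Base and Child classes are hypothetical; only typing_extensions.Self and types.NotImplementedType (available from Python 3.10) are real names. The imports sit behind TYPE_CHECKING and the annotations are quoted, so the snippet runs even where typing_extensions is not installed.

```python
import operator
from typing import TYPE_CHECKING, Any, Callable

if TYPE_CHECKING:
    # Needed only by the type checker, never at runtime.
    from types import NotImplementedType

    from typing_extensions import Self


class Base:
    # Annotating the attribute lets mypy check every access to _data.
    _data: list[float]

    @classmethod
    def _from_data(cls, data: list[float]) -> "Self":
        # Self resolves to the calling subclass, not to Base.
        obj = cls.__new__(cls)
        obj._data = data
        return obj

    def _arith_method(
        self, other: Any, op: Callable[[Any, Any], Any]
    ) -> "Self | NotImplementedType":
        # NotImplemented is a value; NotImplementedType is its type,
        # which is what belongs in the annotation.
        if not isinstance(other, (int, float)):
            return NotImplemented
        return type(self)._from_data([op(v, other) for v in self._data])


class Child(Base):
    pass


c = Child._from_data([1.0, 2.0])
r = c._arith_method(3, operator.add)
print(type(r).__name__)  # Child
```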

Tests

  • Adds unit tests covering:
    • EA–EA arithmetic
    • EA–scalar arithmetic
    • NotImplemented propagation
    • User-visible TypeError behavior for unsupported operands
  • Existing argsort / NaN placement tests remain unchanged and passing
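
The four test categories above might take roughly the following shape. This is a sketch only, not the tests added in this PR: ToyArray is a hypothetical list-backed stand-in, and plain assert statements are used so it runs without pytest (with pytest, the last case would more naturally use pytest.raises(TypeError)).

```python
import operator


class ToyArray:
    """Hypothetical minimal array with an _arith_method hook."""

    def __init__(self, values):
        self._data = list(values)

    def _arith_method(self, other, op):
        if hasattr(other, "_data"):
            return ToyArray(op(a, b) for a, b in zip(self._data, other._data))
        if isinstance(other, (int, float)):
            return ToyArray(op(a, other) for a in self._data)
        return NotImplemented

    def __add__(self, other):
        return self._arith_method(other, operator.add)


def test_ea_ea_arithmetic():
    assert (ToyArray([1, 2]) + ToyArray([10, 20]))._data == [11, 22]


def test_ea_scalar_arithmetic():
    assert (ToyArray([1, 2]) + 5)._data == [6, 7]


def test_not_implemented_propagation():
    # Calling the hook directly exposes the sentinel value.
    result = ToyArray([1])._arith_method(object(), operator.add)
    assert result is NotImplemented


def test_unsupported_operand_raises_type_error():
    # Through the operator, Python converts NotImplemented into TypeError.
    try:
        ToyArray([1]) + object()
    except TypeError:
        return
    raise AssertionError("expected TypeError")


for test in (
    test_ea_ea_arithmetic,
    test_ea_scalar_arithmetic,
    test_not_implemented_propagation,
    test_unsupported_operand_raises_type_error,
):
    test()
```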

Design notes

  • Index alignment is intentionally not handled here; pandas performs alignment
    before calling into the ExtensionArray.
  • Type coercion and promotion semantics are delegated to the underlying Arkouda
    operations.
  • The implementation follows pandas’ recommended EA patterns rather than Python
    operator overloading.

Example

import pandas as pd

x = pd.array([1, 2, 3], dtype="ak_int64")
y = pd.array([10, 20, 30], dtype="ak_int64")

x + y
# ArkoudaArray([11 22 33])

Reviewer notes

  • The _from_data helper is intentionally minimal and centralizes EA construction.
  • Duck typing (hasattr(other, "_data")) is used to detect EA operands instead of
    importing concrete EA classes, which avoids circular dependencies.
  • All changes are localized to the ExtensionArray layer; no pandas behavior is
    modified.

Closes #5230: ArkoudaExtensionArray arithmetic

@ajpotts ajpotts force-pushed the 5230_ArkoudaExtensionArray_arithmetic branch from dd8d9a8 to 7357416 Compare January 6, 2026 12:23
@ajpotts ajpotts marked this pull request as ready for review January 6, 2026 15:41