Debugging TensorFlow and Pandas Import Issue on macOS Sonoma with Apple M3 Pro

Ilhan Karic
4. June 2024

Introduction

When transitioning to a new hardware setup, unexpected issues can arise. This was the case when I moved from a Windows machine to a MacBook Pro with an Apple M3 Pro chip running macOS Sonoma (version 14.5). A peculiar problem emerged: importing pandas before TensorFlow caused the script to freeze, while reversing the import order worked flawlessly. This blog post details the issue, the debugging process, and the solution.

The Issue

When pandas is imported before TensorFlow/Keras, the script freezes without raising any exception, even with the model code wrapped in try/except blocks. Here’s the minimal script that replicates the issue:

import numpy as np
import os
import pandas as pd  # imported before TensorFlow: this order triggers the freeze
from tensorflow.keras import layers, models

print("Creating simple model...")
try:
    model = models.Sequential([
        layers.Input(shape=(10,)), 
        layers.Dense(64, activation='relu'), 
        layers.Dense(1, activation='linear')
    ])
    print("Model created successfully.")
except Exception as e:
    print(f"Error creating model: {e}")

x_train = np.random.rand(100, 10)
y_train = np.random.rand(100, 1)

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
try:
    model.fit(x_train, y_train, epochs=5, batch_size=32)
    print("Model training completed successfully.")
except Exception as e:
    print(f"Error training model: {e}")

Output When pandas is Imported First

Creating simple model...
2024-05-31 18:04:07.639131: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M3 Pro
2024-05-31 18:04:07.639149:
    systemMemory: 36.00 GB
2024-05-31 18:04:07.639154:
    maxCacheSize: 13.50 GB
2024-05-31 tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-05-31 18:04:07.639186: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)

The script freezes at this point and must be terminated.

Output When TensorFlow/Keras is Imported First

Creating simple model...
2024-05-31 18:07:18.879661: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M3 Pro
2024-05-31 18:07:18.879680:
systemMemory: 36.00 GB
2024-05-31 18:07:18.879685:
maxCacheSize: 13.50 GB
2024-05-31 tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-05-31 18:07:18.879717: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
Model created successfully.
Epoch 1/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.4620
Epoch 2/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 636us/step - loss: 0.3263
Epoch 3/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.2322
Epoch 4/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 629us/step - loss: 0.1395
Epoch 5/5
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 690us/step - loss: 0.1251
Model training completed successfully.
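
Because the freeze raises no exception, a timeout is the only reliable signal that it happened. The following is a minimal sketch of a harness that runs each import order in a fresh interpreter and reports whether it finished, failed, or had to be killed. The names run_with_timeout, compare_import_orders, and the 120-second default are hypothetical choices for illustration, not something from the original investigation.

```python
import subprocess
import sys

# Tiny model-building body shared by both variants (mirrors the script above).
BODY = (
    "model = models.Sequential([layers.Input(shape=(10,)), layers.Dense(1)])\n"
    "print('model created')\n"
)

# The two import orders under comparison.
ORDERS = {
    "pandas first": "import pandas as pd\nfrom tensorflow.keras import layers, models\n",
    "tensorflow first": "from tensorflow.keras import layers, models\nimport pandas as pd\n",
}


def run_with_timeout(code, timeout_s):
    """Run Python source in a fresh interpreter; return 'ok', 'error', or 'timeout'."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so no zombie is left behind.
        return "timeout"
    return "ok" if result.returncode == 0 else "error"


def compare_import_orders(timeout_s=120):
    """Return a dict mapping each import order to its outcome."""
    return {
        label: run_with_timeout(imports + BODY, timeout_s)
        for label, imports in ORDERS.items()
    }
```

Calling compare_import_orders() on an affected machine would be expected to report "timeout" for the pandas-first variant and "ok" for the other, though the timeout has to be generous enough to cover a slow first import of TensorFlow.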

Debugging the Issue

Step-by-Step Debugging Process

  1. Environment Setup: Ensured all packages were up-to-date.
    pip install --upgrade pandas tensorflow
  2. Virtual Environment: Created a new virtual environment to isolate dependencies.
    python -m venv myenv
    source myenv/bin/activate
    pip install pandas tensorflow
  3. Set Environment Variables: Set TF_CPP_MIN_LOG_LEVEL to control the verbosity of TensorFlow’s C++ logging (a value of 2 hides INFO and WARNING messages).
    export TF_CPP_MIN_LOG_LEVEL=2
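  4. Force CPU Usage: Modified the script to try to force TensorFlow onto the CPU. Note that CUDA_VISIBLE_DEVICES only hides CUDA devices, so it most likely has no effect on the Metal GPU used on Apple Silicon.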
    import os
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
  5. Enable TensorFlow Debugging: Enabled device placement logging to see which device each operation is assigned to.
    import tensorflow as tf
    tf.debugging.set_log_device_placement(True)
  6. Disable justMyCode: Set justMyCode to false in the debugger’s launch configuration so that stepping enters library code such as TensorFlow instead of skipping over it.
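
For step 4, TensorFlow's own device configuration API is a way to hide GPUs that does not depend on CUDA. The sketch below is an assumption about what a Metal-aware version of that step could look like. It was not part of the original debugging session, and whether it avoids the freeze is untested. The helper name hide_gpus_from_tensorflow is hypothetical.

```python
def hide_gpus_from_tensorflow():
    """Hide all GPU devices from TensorFlow and return the names of what remains.

    Returns None if TensorFlow is not installed. Must be called before any
    TensorFlow operation runs, because visible devices cannot be changed
    once the runtime has been initialized.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # An empty list for the "GPU" device type leaves only the CPU visible.
    # The Metal plugin registers its device under the "GPU" type as well.
    tf.config.set_visible_devices([], "GPU")
    return [device.name for device in tf.config.get_visible_devices()]
```

If this is called right after the imports, the "Created TensorFlow device" log line for the Metal GPU should no longer appear, which gives a quick way to confirm the setting took effect.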

Identifying the Problematic Line

Using the debugger, I stepped through the script until it froze. I placed breakpoints and stepped into the function calls until I pinpointed the line where the freeze occurred:

tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name, inputs, attrs, num_outputs)

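This line is in the quick_execute function in TensorFlow’s execute.py. TFE_Py_Execute hands the operation over to TensorFlow’s native runtime, so the hang happens outside Python, which would explain why no exception ever reaches the try/except blocks: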

def quick_execute(op_name, num_outputs, inputs, attrs, ctx, name=None):
    """Execute a TensorFlow operation."""
    device_name = ctx.device_name
    try:
        ctx.ensure_initialized()
        tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name, inputs, attrs, num_outputs)
    except core._NotOkStatusException as e:
        if name is not None:
            e.message += " name: " + name
        raise core._status_to_exception(e) from None
    except TypeError as e:
        keras_symbolic_tensors = [x for x in inputs if _is_keras_symbolic_tensor(x)]
        if keras_symbolic_tensors:
            raise core._SymbolicException(
                "Inputs to eager execution function cannot be Keras symbolic "
                "tensors, but found {}".format(keras_symbolic_tensors)
            )
        raise e
    return tensors
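
Stepping through a debugger is one way to find such a line. A lighter alternative, sketched here as an assumption and not as the method used in this post, is Python's standard faulthandler module, which can dump the Python stack of every thread after a timeout even when the main thread is stuck in native code. The helper names arm_hang_watchdog and disarm_hang_watchdog are hypothetical.

```python
import faulthandler
import sys


def arm_hang_watchdog(timeout_s, file=sys.stderr):
    """Dump all thread stacks to `file` if the process is still running after timeout_s.

    `file` must stay open until the dump happens or the watchdog is disarmed.
    """
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=file)


def disarm_hang_watchdog():
    """Cancel the pending dump once the risky section has completed."""
    faulthandler.cancel_dump_traceback_later()
```

A typical use would be to call arm_hang_watchdog(60) just before creating the model and disarm_hang_watchdog() after training. If the script hangs, the dumped traceback should end in the innermost Python frame, which in this case would be expected to be quick_execute.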

Workaround

The quickest workaround I found is to import TensorFlow/Keras before pandas. This order avoids the freeze and lets the script run normally. I would recommend this workaround only if you specifically need a TensorFlow 2.16.x release.

Solution

After extensive testing and debugging, I found that downgrading TensorFlow to version 2.15.0 resolved the issue. Here are the steps I followed:

  1. Tested Different TensorFlow Versions: Downgraded step by step from 2.16.1 until 2.15.0 worked.
    pip install tensorflow==2.15.0
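
To avoid rediscovering the problem after a later upgrade, a script could check the installed version at startup. This is a minimal sketch using only the standard library. Treating the entire 2.16.x series as affected is an assumption, since only 2.16.1 was observed to freeze and 2.15.0 to work. The function names are hypothetical.

```python
from importlib import metadata


def is_affected_version(version):
    """Return True for TensorFlow 2.16.x (assumed affected; only 2.16.1 was tested)."""
    return version.split(".")[:2] == ["2", "16"]


def installed_tensorflow_version():
    """Return the installed TensorFlow version string, or None if it is not installed."""
    try:
        return metadata.version("tensorflow")
    except metadata.PackageNotFoundError:
        return None


def warn_if_affected():
    """Print a hint when the installed TensorFlow falls in the assumed-affected series."""
    version = installed_tensorflow_version()
    if version is not None and is_affected_version(version):
        print(f"TensorFlow {version} detected: import TensorFlow before pandas "
              "or install tensorflow==2.15.0.")
```

Reading the version from package metadata avoids importing TensorFlow itself, so the check cannot trigger the freeze it is meant to warn about.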

Conclusion

With TensorFlow 2.16.1 on macOS Sonoma and an Apple M3 Pro chip, importing pandas before TensorFlow/Keras can cause a script to freeze, possibly due to a memory or lock issue. Importing TensorFlow/Keras before pandas works around the problem, and downgrading TensorFlow to version 2.15.0 resolves it. If you encounter similar issues, start with the basics: check your package versions, use a fresh virtual environment, and set appropriate environment variables before digging into the debugger.