Data Scientist

SLB

Location

Houston, TX

Salary

Not specified

Type

Full-time

Posted

Today

via LinkedIn

Job Description

Build, train, and deploy large-scale, self-supervised "foundation" models that learn rich representations of time series and sequential sensor data, as well as text and vision data. These models are then fine-tuned for tasks such as anomaly/event detection, predictive maintenance, forecasting, classification, and multi-modal sensor fusion in industrial and scientific applications.

Data/Signal Processing

  • Time Series & Sequential Data: processing, augmentation, feature engineering for financial, industrial, IoT, medical, or other sensor streams (univariate/multivariate time series).
  • Sensor Data Analysis: expertise with diverse sensor modalities (e.g., accelerometers, temperature, vibration, audio, images), sampling rates, synchronization, and real-world noise/artifact handling.
  • Multi-Modality Learning: integrating heterogeneous data types (time series, images, text, audio, structured) into robust deep learning architectures; cross-modal representation learning.

Machine Learning & Foundation Model Expertise

  • Self-supervised and Semi-supervised Learning: time series foundation models, masked modeling, contrastive methods, temporal predictive coding, multimodal alignment and fusion.
  • Model Architectures: sequence models (RNNs, GRU/LSTM, TCN), 1D/2D/3D CNNs, Transformers (BERT, ViT, TimeSFormer), graph neural networks, diffusion/generative models, multi-modal/fusion encoders.
  • Transfer Learning & Fine-Tuning at Scale: prompt/adapter-based strategies, temporal domain adaptation, few-shot learning for specialized tasks.
  • Evaluation Metrics: regression/classification (MSE, F1, AUC), time series similarity (DTW, correlation), event detection/segmentation (IoU, accuracy), business/end-user KPIs.

Software & Infrastructure

  • Programming: expert Python (NumPy, SciPy, Pandas), C++/CUDA for custom kernels and high-performance preprocessing.
  • Deep Learning Frameworks: PyTorch (Lightning, Distributed), TensorFlow/Keras, JAX/Flax.
  • Large-scale Training: multi-GPU, multi-node clusters, mixed-precision, ZeRO optimization, scalable data loaders for long sequences.
  • Data Engineering: robust pipelines for ingesting, cleaning, segmenting, and aligning large-scale, time-synchronized multi-sensor datasets.

Mathematical & Algorithmic Foundations

  • Linear Algebra, Probability & Statistics, Optimization (stochastic, convex/non-convex, Bayesian).
  • Signal Processing: Fourier/wavelet analysis, filters (Kalman, Savitzky–Golay), resampling, noise modeling.
  • Numerical Methods: ODE/PDE solvers, inverse problems, regularization, time-frequency methods for complex systems.

Collaboration & Communication

  • Cross-disciplinary teamwork with domain experts, engineers, product owners, and end-users from industrial, scientific, or medical backgrounds.
  • Clear presentation of complex model behaviors (interpretability, attention analysis), uncertainty quantification, and value impact.
  • MS or Ph.D. in computer science, data science, AI, or a related field.
  • 3+ years of relevant experience in data science, AI, or related fields.
