Data Scientist

Acro Service Corp

Lexington, MA

Position Title: Data Scientist (Remote)

Location: Lexington, MA, USA, 02421

Duration: 5-month contract on W2 (possible extension)

*********************NO C2C******************

Note: Candidates must have an active clearance (Secret, Top Secret, etc.)

US citizens only

Position Description:

Designs, develops, and implements methods, processes, and systems to consolidate and analyze diverse data sets, both structured and unstructured.
Develops software programs, algorithms, dashboards, information tools, and queries to clean, model, integrate and evaluate datasets. Keeps abreast of new analytic methodologies and technologies.
Collaborates with functional business units to drive business solutions and direction.

Key responsibilities include, but are not limited to:

Design, implement, and maintain enterprise-scale search solutions using Apache Solr
Develop and optimize semantic search capabilities using vector embeddings and neural search models
Build custom indexers and indexing pipelines that support vector embeddings alongside traditional text fields
Implement and tune Approximate Nearest Neighbor (ANN) algorithms for efficient similarity search at scale
Design and optimize similarity functions (cosine, dot product, Euclidean) for various search use cases
Build hybrid search systems that combine traditional keyword-based search with vector-based semantic search
Perform traditional relevancy engineering including query analysis, field weighting, boosting strategies, and result tuning
Conduct relevancy analysis using quantitative metrics and qualitative evaluation methods
Monitor search performance metrics and implement continuous improvements
Work cross-functionally with product, engineering, and data teams to define search requirements

Required Qualifications:

5+ years of hands-on experience with Apache Solr or Lucene in production environments
Strong expertise in traditional relevancy engineering including query parsing, field boosting, function queries, and relevance tuning
Proven experience conducting relevancy analysis using both automated metrics and manual evaluation techniques
Strong expertise in vector embeddings and their application to semantic search
Proven experience building hybrid search systems that combine keyword and vector-based approaches
Knowledge of search relevance metrics (NDCG, MRR, precision/recall)
Excellent problem-solving and analytical skills
Strong communication skills and ability to work in collaborative environments

Nice to Have:

Databases and Data Engineering for Big Data
Elasticsearch
Statistical Methods

Clearance:

Due to the nature of the work, candidates should have an active clearance (Secret, Top Secret, etc.) to be considered for this position.

Interview Process: