AttributeError: 'implicit.evaluation._memoryviewslice' object has no attribute 'dtype' when calling mean_average_precision_at_k function #726
Hi @MRossa157! I trained a model with the Python `implicit` package and ran into the same problem. Minimal example to reproduce the error:

```python
import os
import random

import pandas as pd
from scipy.sparse import csr_matrix

from implicit.evaluation import train_test_split, ndcg_at_k, mean_average_precision_at_k
from implicit.gpu.als import AlternatingLeastSquares

os.environ['OPENBLAS_NUM_THREADS'] = "1"
os.environ['CUDA_VISIBLE_DEVICES'] = "0"

# init random data
n_actions = 100000
max_uid = 100000
max_action_id = 10000
df = pd.DataFrame(data={
    "user_id": [random.randint(1, max_uid) for i in range(0, n_actions)],
    "action": [random.randint(1, max_action_id) for i in range(0, n_actions)],
    "impression": [1 for i in range(0, n_actions)],
})

# convert to sparse format
user_rows = [uid for uid in df.user_id.tolist()]
query_cols = [st for st in df.action.tolist()]
qvecs = csr_matrix((df.impression, (user_rows, query_cols)))

# train/test split and model training
train_user_items, test_user_items = train_test_split(qvecs, train_percentage=0.9, random_state=19)
model = AlternatingLeastSquares(factors=130, regularization=0.05, alpha=1.0, calculate_training_loss=True)
model.fit(train_user_items)

# calculate NDCG -- this call raises the AttributeError
ndcg = ndcg_at_k(model, train_user_items, test_user_items, K=14, show_progress=True, num_threads=1)
```

Package versions: os: …
Updating to scipy 1.14.1 should resolve the issue.
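A quick way to confirm which versions the interpreter actually picks up (a minimal sketch; assumes both packages are importable in the active environment):

```python
# Print the versions actually visible to the running interpreter
import scipy
import implicit

print(scipy.__version__)     # should report >= 1.14.1 after upgrading
print(implicit.__version__)  # the issue was reported against 0.7.2
```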
It does not with the wheels, at least on my side. Did you compile from scratch?
Update: a Python workaround to perform the evaluation "manually" (note it needs `numpy` and `tqdm` in addition to the imports above):

```python
import numpy as np
import tqdm


def ranking_metrics_at_k(model, train_user_items, test_user_items, K=10, show_progress=True):
    """
    Calculates ranking metrics (Precision@K, MAP@K, NDCG@K, AUC) for a trained model.

    Parameters:
        model : Trained ALS model (or other Implicit model).
        train_user_items : csr_matrix
            User-item interaction matrix used for training.
        test_user_items : csr_matrix
            User-item interaction matrix for evaluation.
        K : int
            Number of items to evaluate.
        show_progress : bool
            Show a progress bar during evaluation.

    Returns:
        dict : Dictionary with precision, MAP, NDCG, and AUC scores.
    """
    # Ensure matrices are in CSR format
    train_user_items = train_user_items.tocsr()
    test_user_items = test_user_items.tocsr()
    num_users, num_items = test_user_items.shape

    relevant = 0
    total_precision_div = 0
    total_map = 0
    total_ndcg = 0
    total_auc = 0
    total_users = 0

    # Compute cumulative gain for NDCG normalization
    cg = 1.0 / np.log2(np.arange(2, K + 2))  # Discount factor
    cg_sum = np.cumsum(cg)  # Ideal DCG normalization

    # Get users with at least one item in the test set
    users_with_test_data = np.where(np.diff(test_user_items.indptr) > 0)[0]

    # Progress bar
    progress = tqdm.tqdm(total=len(users_with_test_data), disable=not show_progress)

    batch_size = 1000
    start_idx = 0
    while start_idx < len(users_with_test_data):
        batch_users = users_with_test_data[start_idx:start_idx + batch_size]
        recommended_items, _ = model.recommend(batch_users, train_user_items[batch_users], N=K)
        start_idx += batch_size

        for user_idx, user_id in enumerate(batch_users):
            test_items = set(test_user_items.indices[test_user_items.indptr[user_id]:test_user_items.indptr[user_id + 1]])
            if not test_items:
                continue  # Skip users without test data

            num_relevant = len(test_items)
            total_precision_div += min(K, num_relevant)
            ap = 0
            hit_count = 0
            auc = 0
            idcg = cg_sum[min(K, num_relevant) - 1]  # Ideal Discounted Cumulative Gain (IDCG)
            num_negative = num_items - num_relevant

            for rank, item in enumerate(recommended_items[user_idx]):
                if item in test_items:
                    relevant += 1
                    hit_count += 1
                    ap += hit_count / (rank + 1)
                    total_ndcg += cg[rank] / idcg
                else:
                    auc += hit_count  # Accumulate hits for AUC calculation

            auc += ((hit_count + num_relevant) / 2.0) * (num_negative - (K - hit_count))
            total_map += ap / min(K, num_relevant)
            total_auc += auc / (num_relevant * num_negative)
            total_users += 1

        progress.update(len(batch_users))

    progress.close()

    # Compute final metrics
    precision = relevant / total_precision_div if total_precision_div > 0 else 0
    mean_ap = total_map / total_users if total_users > 0 else 0
    mean_ndcg = total_ndcg / total_users if total_users > 0 else 0
    mean_auc = total_auc / total_users if total_users > 0 else 0

    return {
        "precision": precision,
        "map": mean_ap,
        "ndcg": mean_ndcg,
        "auc": mean_auc
    }
```
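For reference, a usage sketch calling the workaround with the model and matrices from the reproduction script above (assumes `model`, `train_user_items`, and `test_user_items` exist as defined earlier):

```python
# Evaluate the trained ALS model with the pure-Python workaround
metrics = ranking_metrics_at_k(model, train_user_items, test_user_items, K=14, show_progress=True)
print(metrics)  # {'precision': ..., 'map': ..., 'ndcg': ..., 'auc': ...}
```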
Hello! I encountered an issue when using the `mean_average_precision_at_k` function from the `implicit` library.

**Problem Description:**
When calling the `mean_average_precision_at_k` function, the following error occurs:

```
AttributeError: 'implicit.evaluation._memoryviewslice' object has no attribute 'dtype'
```

**Context:**
- `implicit` library version: 0.7.2

**Steps to Reproduce:**
1. Install `implicit` library version 0.7.2.
2. Call the `mean_average_precision_at_k` function (see the illustrative call below).

**Expected Behavior:**
The function should return the MAP@K metric value without errors.

**Additional Information:**
I would appreciate any assistance in resolving this issue.
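For illustration, a minimal sketch of such a call; the argument values here are assumptions, not the reporter's originals, and the signature is the same one used for `ndcg_at_k` in the reproduction script above:

```python
from implicit.evaluation import mean_average_precision_at_k

# Illustrative call only -- argument values are assumed, not the reporter's
map_at_k = mean_average_precision_at_k(
    model, train_user_items, test_user_items,
    K=10, show_progress=True, num_threads=1,
)
# Raises:
# AttributeError: 'implicit.evaluation._memoryviewslice' object has no attribute 'dtype'
```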