Commit 1aa6cf1

dprodger and claude committed
Add per-link manual verification for Spotify duration mismatches
Some Spotify matches that read as duration mismatches are intentional: the wrong-track-on-an-album case, where the matcher's automation can't help and an admin has manually confirmed the link is correct. Without a way to record that judgement, the same row reappears on /admin/duration-mismatches forever and gets re-evaluated by the rematch sweep on every cycle.

The matcher already honours match_method='manual' as a "do not touch" flag in update_recording_release_track_id and clear_recording_release_track. This commit adds the missing producer side: a way to flip the flag from the admin UI, and the read-side filtering that makes verified rows disappear from the mismatch reports.

Backend:
- New db helper integrations.spotify.db.set_track_link_manual_override(conn, streaming_link_id, manual=True/False) flips match_method to 'manual' (or back to 'fuzzy_search' on unverify). No schema change: it reuses the existing match_method column.
- New admin endpoint POST /admin/duration-mismatches/links/<id>/verify with body {"manual": true|false}. Returns 200 + JSON on success, 404 when the link doesn't exist. Goes through the existing admin gate + CSRF double-submit.
- get_songs_with_duration_mismatches and get_releases_with_duration_mismatches in integrations/spotify/db.py now exclude rows where match_method = 'manual'. This is the primary effect: the durable rematch sweep (rematch_spotify_duration_mismatches.py) stops re-querying confirmed matches.
- The /admin/duration-mismatches list page and /admin/duration-mismatches/<song> detail page apply the same filter, with an ?include_verified=1 toggle for when an admin needs to review or unverify previously-accepted rows. The list page additionally shows a banner with the count of manually-verified links currently excluded.

Admin UI:
- New "Status" column on the per-song mismatch table with a "Verify" button per row. After click, the row reloads with a "Verified" badge and an "Unverify" button next to it. Verified rows in include-verified mode get a faded background so they visually recede.
- New "Include manually verified mismatches" checkbox above the table on both the list page and the per-song page; preserved across threshold changes via the URL query.

Tests: 9 new in test_admin_verify_match.py covering the db helper, the mismatch-query exclusion (both songs and releases helpers), the admin endpoint's success/unauth/404 paths, and the verify/unverify round trip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
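
For orientation, a request to the new endpoint could be built roughly as follows. This is only a sketch: the commit says the route goes through the existing admin gate and CSRF double-submit but does not show how, so the base URL, the `X-CSRF-Token` header, and the `csrf_token` cookie name are all assumptions, and `build_verify_request` is a hypothetical helper that is not part of the commit.

```python
import json
import urllib.request


def build_verify_request(base_url, link_id, manual=True, csrf_token=""):
    """Build (but do not send) the POST that verifies or unverifies a link."""
    url = f"{base_url}/admin/duration-mismatches/links/{link_id}/verify"
    body = json.dumps({"manual": manual}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Double-submit CSRF: the same token travels in a cookie and a header.
        # Both names below are assumptions, not taken from the commit.
        "X-CSRF-Token": csrf_token,
        "Cookie": f"csrf_token={csrf_token}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Unverify a link. Per the commit message the server answers 200 with JSON,
# or 404 when no Spotify link with that id exists.
req = build_verify_request("https://admin.example.invalid", "abc123",
                           manual=False, csrf_token="t0ken")
```

Sending `{"manual": false}` reverts the link to 'fuzzy_search'; omitting the body defaults to verifying.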
1 parent fa7ef9c commit 1aa6cf1

5 files changed

Lines changed: 528 additions & 7 deletions

backend/integrations/spotify/db.py

Lines changed: 44 additions & 0 deletions
@@ -261,6 +261,10 @@ def get_releases_with_duration_mismatches(song_id: str, threshold_ms: int = 6000
           AND rec.duration_ms IS NOT NULL
           AND rrsl.duration_ms IS NOT NULL
           AND ABS(rec.duration_ms - rrsl.duration_ms) > %s
+          -- Skip rows the admin has manually verified; they're
+          -- intentionally locked to this Spotify track even when the
+          -- duration would otherwise look mismatched.
+          AND (rrsl.match_method IS NULL OR rrsl.match_method != 'manual')
     """
 
     params = [song_id, threshold_ms]
@@ -305,6 +309,8 @@ def get_songs_with_duration_mismatches(threshold_ms: int = 60000) -> List[dict]:
                 WHERE rec.duration_ms IS NOT NULL
                   AND rrsl.duration_ms IS NOT NULL
                   AND ABS(rec.duration_ms - rrsl.duration_ms) > %s
+                  -- Skip admin-verified matches; see get_releases_with_duration_mismatches.
+                  AND (rrsl.match_method IS NULL OR rrsl.match_method != 'manual')
                 ORDER BY s.title
             """, (threshold_ms,))
             return cur.fetchall()
@@ -624,6 +630,44 @@ def is_track_manual_override(conn, recording_release_id: str, service: str = 'sp
     return False
 
 
+def set_track_link_manual_override(conn, streaming_link_id: str,
+                                   manual: bool = True,
+                                   log: logging.Logger = None) -> bool:
+    """Flip a streaming-link row's match_method to/from 'manual'.
+
+    'manual' is the magic value that update_recording_release_track_id
+    and clear_recording_release_track honour as a "do not touch" flag,
+    so once an admin marks a (recording, Spotify track) pair as manually
+    verified, the matcher leaves it alone on every subsequent re-run.
+
+    On unverify (manual=False) the method is reset to 'fuzzy_search';
+    the next matcher pass will re-evaluate from scratch and either
+    re-confirm or replace it. We don't try to remember the previous
+    auto-method: there's no schema slot for it and rediscovery is the
+    correct behaviour.
+
+    Returns True if a row was updated, False if no matching row exists.
+    """
+    log = log or logger
+    new_method = 'manual' if manual else 'fuzzy_search'
+    with conn.cursor() as cur:
+        cur.execute(
+            """
+            UPDATE recording_release_streaming_links
+            SET match_method = %s, updated_at = NOW()
+            WHERE id = %s AND service = 'spotify'
+            """,
+            (new_method, streaming_link_id),
+        )
+        if cur.rowcount == 0:
+            return False
+    log.info(
+        "Streaming link %s match_method set to '%s'",
+        streaming_link_id, new_method,
+    )
+    return True
+
+
 def is_album_manual_override(conn, release_id: str, service: str = 'spotify') -> bool:
     """
     Check if an existing album link is a manual override that should be preserved.
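
The two db.py changes above can be exercised together in a small stand-alone sketch. It uses an in-memory SQLite table as a stand-in for the real PostgreSQL table (so `?` placeholders instead of `%s`, and no `updated_at` column); the cut-down schema and the helper names `unverified_ids` and `set_manual` are assumptions for illustration only. It shows why the filter is written as `IS NULL OR != 'manual'` rather than a bare `!=`, and what the verify/unverify round trip does to the report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE recording_release_streaming_links (
        id TEXT PRIMARY KEY,
        service TEXT NOT NULL,
        match_method TEXT
    )
""")
conn.executemany(
    "INSERT INTO recording_release_streaming_links VALUES (?, ?, ?)",
    [("a", "spotify", None),
     ("b", "spotify", "fuzzy_search"),
     ("c", "spotify", "manual")],
)


def unverified_ids(conn, null_safe=True):
    """Return ids that would still show up in a mismatch report."""
    predicate = (
        "(match_method IS NULL OR match_method != 'manual')"
        if null_safe else "match_method != 'manual'"
    )
    rows = conn.execute(
        "SELECT id FROM recording_release_streaming_links "
        f"WHERE {predicate} ORDER BY id"
    )
    return [row[0] for row in rows]


def set_manual(conn, link_id, manual=True):
    """SQLite stand-in for set_track_link_manual_override."""
    new_method = "manual" if manual else "fuzzy_search"
    cur = conn.execute(
        "UPDATE recording_release_streaming_links SET match_method = ? "
        "WHERE id = ? AND service = 'spotify'",
        (new_method, link_id),
    )
    return cur.rowcount > 0


# A bare != comparison evaluates to NULL for NULL rows, so row "a" vanishes.
naive = unverified_ids(conn, null_safe=False)
safe = unverified_ids(conn)

verified = set_manual(conn, "a")            # admin clicks "Verify"
after_verify = unverified_ids(conn)
missing = set_manual(conn, "no-such-id")    # maps to the endpoint's 404
set_manual(conn, "a", manual=False)         # admin clicks "Unverify"
after_unverify = unverified_ids(conn)
```

After the round trip, row "a" is back in the report with match_method 'fuzzy_search' rather than its original NULL, matching the docstring's note that the previous auto-method is not remembered.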

backend/routes/admin.py

Lines changed: 70 additions & 3 deletions
@@ -24,7 +24,10 @@
 from integrations.musicbrainz.parsing import parse_release_data
 from integrations.musicbrainz.performer_importer import PerformerImporter
 from integrations.musicbrainz.utils import MusicBrainzSearcher
-from integrations.spotify.db import is_track_manual_override
+from integrations.spotify.db import (
+    is_track_manual_override,
+    set_track_link_manual_override,
+)
 from core.spotify_rematch import (
     run_spotify_rematch_for_song,
     save_run,
@@ -2694,6 +2697,10 @@ def duration_mismatches_list():
     threshold = request.args.get('threshold', 60, type=int)
     current_sort = request.args.get('sort', 'mismatch_count')
     current_order = request.args.get('order', 'desc')
+    # ?include_verified=1 to also count rows the admin has manually
+    # verified (match_method='manual'). Default-off so the page is a
+    # signal of work-still-to-do, not a rehash of accepted overrides.
+    include_verified = request.args.get('include_verified') in ('1', 'true', 'yes')
     threshold_ms = threshold * 1000
 
     sort_map = {
@@ -2704,6 +2711,11 @@ def duration_mismatches_list():
     order_col = sort_map.get(current_sort, 'mismatch_count')
     order_dir = 'ASC' if current_order == 'asc' else 'DESC'
 
+    manual_filter = (
+        '' if include_verified
+        else "AND (rrsl.match_method IS NULL OR rrsl.match_method != 'manual')"
+    )
+
     with get_db_connection() as db:
         with db.cursor() as cur:
             cur.execute(f"""
@@ -2723,6 +2735,7 @@ def duration_mismatches_list():
                 WHERE r.duration_ms IS NOT NULL
                   AND rrsl.duration_ms IS NOT NULL
                   AND ABS(r.duration_ms - rrsl.duration_ms) > %s
+                  {manual_filter}
                 GROUP BY s.id, s.title, s.composer
                 ORDER BY {order_col} {order_dir}, s.title ASC
             """, (threshold_ms,))
@@ -2739,12 +2752,21 @@ def duration_mismatches_list():
             """)
             total_spotify = cur.fetchone()['cnt']
 
+            cur.execute("""
+                SELECT COUNT(*) AS cnt
+                FROM recording_release_streaming_links
+                WHERE service = 'spotify' AND match_method = 'manual'
+            """)
+            total_manual = cur.fetchone()['cnt']
+
     total_mismatched = sum(s['mismatch_count'] for s in songs)
 
     summary = {
         'total_songs': len(songs),
         'total_mismatched_links': total_mismatched,
         'total_spotify_links': total_spotify,
+        'total_manual_links': total_manual,
+        'include_verified': include_verified,
     }
 
     return render_template('admin/duration_mismatches_list.html',
@@ -2759,8 +2781,14 @@ def duration_mismatches_list():
 def duration_mismatches_review(song_id):
     """Review duration mismatches for a specific song"""
     threshold = request.args.get('threshold', 60, type=int)
+    include_verified = request.args.get('include_verified') in ('1', 'true', 'yes')
     threshold_ms = threshold * 1000
 
+    manual_filter = (
+        '' if include_verified
+        else "AND (rrsl.match_method IS NULL OR rrsl.match_method != 'manual')"
+    )
+
     with get_db_connection() as db:
         with db.cursor() as cur:
             cur.execute("""
@@ -2772,7 +2800,7 @@ def duration_mismatches_review(song_id):
                 return "Song not found", 404
             song = dict(song)
 
-            cur.execute("""
+            cur.execute(f"""
                 SELECT
                     r.id AS recording_id,
                     r.title,
@@ -2805,6 +2833,7 @@ def duration_mismatches_review(song_id):
                   AND r.duration_ms IS NOT NULL
                   AND rrsl.duration_ms IS NOT NULL
                   AND ABS(r.duration_ms - rrsl.duration_ms) > %s
+                  {manual_filter}
                 ORDER BY r.recording_year NULLS LAST, rel.title
             """, (song_id, threshold_ms))
             rows = [dict(row) for row in cur.fetchall()]
@@ -2838,6 +2867,7 @@ def duration_mismatches_review(song_id):
                 'diff_display': _format_diff(diff_ms),
                 'match_confidence': float(row['match_confidence']) if row['match_confidence'] is not None else None,
                 'match_method': row['match_method'],
+                'is_manual': row['match_method'] == 'manual',
                 'release_title': row['release_title'],
                 'artist_credit': row['artist_credit'],
                 'release_mb_id': row['release_mb_id'],
@@ -2851,7 +2881,8 @@ def duration_mismatches_review(song_id):
     return render_template('admin/duration_mismatches_review.html',
                            song=song,
                            recordings=recordings,
-                           threshold=threshold)
+                           threshold=threshold,
+                           include_verified=include_verified)
 
 
 @admin_bp.route('/duration-mismatches/delete-links', methods=['POST'])
@@ -2887,6 +2918,42 @@ def duration_mismatches_delete():
         return jsonify({'error': str(e)}), 500
 
 
+@admin_bp.route('/duration-mismatches/links/<link_id>/verify', methods=['POST'])
+def duration_mismatches_verify_link(link_id):
+    """Mark a Spotify streaming link as manually verified (or un-verify it).
+
+    Flips `recording_release_streaming_links.match_method` to 'manual',
+    which is the magic value the matcher already honours as a "do not
+    touch" flag in update_recording_release_track_id and
+    clear_recording_release_track. The duration-mismatch admin queries
+    also exclude rows with match_method='manual' so the UI stops nagging
+    about a match the admin has already accepted.
+
+    Body: optional JSON {"manual": false} to revert. Default is to set
+    manual=true.
+    """
+    data = request.get_json(silent=True) or {}
+    set_manual = bool(data.get('manual', True))
+    try:
+        with get_db_connection() as db:
+            updated = set_track_link_manual_override(
+                db, link_id, manual=set_manual, log=logger,
+            )
+            db.commit()
+    except Exception as e:
+        logger.error(f"Error toggling manual override on link {link_id}: {e}")
+        return jsonify({'error': str(e)}), 500
+
+    if not updated:
+        return jsonify({'error': 'Streaming link not found'}), 404
+
+    return jsonify({
+        'success': True,
+        'link_id': link_id,
+        'match_method': 'manual' if set_manual else 'fuzzy_search',
+    })
+
+
 @admin_bp.route('/users')
 def users_list():
     """List user accounts with email search and pagination."""
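
One detail of the new endpoint worth noting: `bool(data.get('manual', True))` treats any non-empty string as true, so a client that sends `{"manual": "false"}` (a string rather than a JSON boolean) would verify the link instead of reverting it. A stricter parser might look like the sketch below. `parse_manual_flag` is a hypothetical helper, not something in the commit, and the set of accepted string spellings is an assumption that borrows from the `include_verified` query parsing.

```python
def parse_manual_flag(data, default=True):
    """Interpret the 'manual' field of a decoded JSON body strictly.

    Accepts real booleans and a few common string spellings; anything
    else raises ValueError so the caller can answer with a 400.
    """
    value = data.get("manual", default) if isinstance(data, dict) else default
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered in ("1", "true", "yes"):
            return True
        if lowered in ("0", "false", "no"):
            return False
    raise ValueError(f"'manual' must be a JSON boolean, got {value!r}")


# The loose coercion used in the endpoint versus the strict parse.
loose = bool({"manual": "false"}.get("manual", True))
strict = parse_manual_flag({"manual": "false"})
```

Whether the extra strictness is worth it depends on who calls the endpoint; if the only client is the admin page's own JavaScript sending real booleans, the existing coercion behaves as intended.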

backend/templates/admin/duration_mismatches_list.html

Lines changed: 17 additions & 1 deletion
@@ -231,9 +231,25 @@ <h1>Spotify Duration Mismatches</h1>
                 <option value="desc" {% if current_order == 'desc' %}selected{% endif %}>Descending</option>
             </select>
         </div>
+        <div class="filter-group">
+            <label>&nbsp;</label>
+            <label style="font-weight: normal; display: flex; align-items: center; gap: 6px;">
+                <input type="checkbox" name="include_verified" value="1"
+                       {% if summary.include_verified %}checked{% endif %}>
+                Include verified matches
+            </label>
+        </div>
         <button type="submit" class="apply-btn">Apply</button>
     </form>
 
+    {% if summary.total_manual_links %}
+    <p style="font-size: 13px; color: #666; margin-top: -8px;">
+        {{ summary.total_manual_links }} Spotify link(s) are manually
+        verified and excluded from this report
+        {% if summary.include_verified %}<em>(currently included)</em>{% else %}; check the box above to include them{% endif %}.
+    </p>
+    {% endif %}
+
     <!-- Results Table -->
     {% if songs %}
     <table>
@@ -249,7 +265,7 @@ <h1>Spotify Duration Mismatches</h1>
             {% for song in songs %}
             <tr>
                 <td>
-                    <a href="{{ admin_url('/admin/duration-mismatches/') }}{{ song.song_id }}?threshold={{ threshold }}" class="song-link">
+                    <a href="{{ admin_url('/admin/duration-mismatches/') }}{{ song.song_id }}?threshold={{ threshold }}{% if summary.include_verified %}&include_verified=1{% endif %}" class="song-link">
                         <div class="song-title">{{ song.title }}</div>
                         {% if song.composer %}
                         <div class="composer">{{ song.composer }}</div>
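
The template change above carries `include_verified` into the per-song link by string concatenation. The same link can be sketched in Python with `urllib.parse.urlencode`, which makes the "append the flag only when it is set" rule explicit. `review_url` and its default `prefix` are hypothetical; in the real template the prefix comes from `admin_url(...)`, whose output is not shown in this commit.

```python
from urllib.parse import urlencode


def review_url(song_id, threshold, include_verified=False,
               prefix="/admin/duration-mismatches/"):
    """Build the per-song review link, preserving the verified toggle."""
    params = {"threshold": threshold}
    if include_verified:
        # Only emitted when set, mirroring the {% if %} in the template.
        params["include_verified"] = 1
    return f"{prefix}{song_id}?{urlencode(params)}"


default_link = review_url("42", 60)
verified_link = review_url("42", 60, include_verified=True)
```

On the server side either link is parsed by the same `request.args.get('include_verified') in ('1', 'true', 'yes')` check shown in admin.py, so a missing parameter and an unchecked box are equivalent.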
