Issues: internetarchive/heritrix3
Issues list
Text versions of DNS should be recorded as WARC-Type resource instead of response
bug
#568
opened Oct 12, 2023 by
ato
Redirect field in seeds-report.txt is only populated for status 301 and 302
bug
#564
opened Aug 20, 2023 by
ato
Provided seed files are updated (the more the job is repeated, the more they are modified)
#558
opened Apr 25, 2023 by
cgr71ii
Maven build fails due to HTTP only upstream servers
archive.org
archive.org services not (just) Heritrix
#481
opened May 2, 2022 by
Jauchi
Commas in srcset-URLs are not handled correctly
archive.org
archive.org services not (just) Heritrix
#458
opened Jan 15, 2022 by
grob
Crawl job stats and reports misleading when excluding PDF-Files (follow up to issue #453)
bug
#455
opened Dec 20, 2021 by
oschihin
Allow plain HTTP console access (as a non-default option)
feature request
#440
opened Sep 30, 2021 by
anjackson
Better handling of extracted URIs that are "data URIs" (base64 encoded media)
#422
opened Jul 29, 2021 by
kris-sigur
Disk usage is not within je.maxDisk or je.freeDisk limits and write operations are prohibited
#340
opened Jul 16, 2020 by
tchnlgst
Torrents created from very large collections by ia_make_torrent are truncated
archive.org
archive.org services not (just) Heritrix
#321
opened Apr 3, 2020 by
khimaros
Web UI on non-https doesn't respond sensibly
feature request
pull request welcome
#318
opened Mar 26, 2020 by
pegleGrot