LEDE Table of Packages: Good or bad?

@chris5560 In https://lede-project.org/packages/pkgdata/ddns-scripts, can we have a listing of supported ddns providers in the package description?

Good example for providing useful information in the description: https://lede-project.org/packages/pkgdata/luci-i18n-base-lang

@bobafetthotmail I think it wasn't that clever an idea to link Maintainer_pkg-maintainer to the contact page. We have 192 maintainers, and 138 packages w/o maintainer. Only a few are listed on the contact page, and none with real contact data (email) behind them.

What do you think of instead making the Maintainer a clickable filter for a datatable?

-> Click on maintainer -> all packages of this maintainer will be shown in the table.

To change this: Remove the | in front of the Maintainer name.

I think it wasn't that clever an idea to link Maintainer_pkg-maintainer to the contact page.

-cough-(I told you)-cough-

What do you think of instead making the Maintainer a clickable filter for a datatable?

Sure.

There is already a very good link inside the package description. Why update multiple pages with duplicate information and risk them running out of sync?
Something like that link should be part of/available for every package description because it makes things easy for a user looking into a LuCI package install.
It (should) give full information about the package and its configuration.
Additionally, there could be a README.md file with basic information inside every package's directory (like the adblock package does).

[quote="tmomas, post:141, topic:123, full:true"]@chris5560 In https://lede-project.org/packages/pkgdata/ddns-scripts, can we have a listing of supported ddns providers in the package description?[/quote]
Btw, it's easy for me to add another special function to parse the list of supported packages from the list in the source, but the plan was to have a page accessed by clicking on the package name, where you find full info and tutorials and whatever (which in this case would be a copy-paste of the one linked in the description).

done another upload, running the indexer.
It seems that dokuwiki does not like txt files with uppercase letters in their names; around 50 packages were not even loaded because of that :unamused:
https://lede-project.org/packages/start
With this upload all packages are shown with a correct category/submenu.

please ignore this part.
Indexes were also auto-generated as requested and I put them in the right place (/var/www/dokuwiki/data/pages/packages/index), but they are not shown in the wiki, I don't know why. https://lede-project.org/packages/index/start
They are generated correctly and if I copy-paste them in a wiki page they work fine in the Preview.

EDIT: hmmm, they don't seem to be detected by the indexer, as if that folder was not even watched.

EDIT2: no, it's my mistake, they don't end with *.txt

EDIT3: amazing, the wiki also hates "(" and ")", so I'm getting multiple views of freeradius-(versionx) indexes. :expressionless: Will have to add a filter for that too.

The restrictions for pagenames: https://www.dokuwiki.org/pagename
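
As a rough illustration of those restrictions, the file names could be normalised before upload. This is only a minimal sketch, assuming that lowercase letters, digits, ".", "-" and "_" are safe and that everything else can simply be replaced with "_". The function name `wiki_pagename` is hypothetical, and dokuwiki's own cleanup does more than this (see the page linked above).

```shell
# Hypothetical helper: turn a package name into a conservative wiki page name.
# Assumption: lowercase letters, digits, ".", "-" and "_" are safe;
# any other character is replaced with "_".
wiki_pagename() {
    printf '%s\n' "$1" \
        | LC_ALL=C tr 'A-Z' 'a-z' \
        | LC_ALL=C sed 's/[^a-z0-9._-]/_/g'
}

# Example calls with names like the ones that caused trouble in this thread
wiki_pagename 'freeradius-(versionx)'
wiki_pagename 'Foo_Bar'
```

The result would then be used as the file name (plus .txt) when writing the dataentry files, so uppercase letters and brackets never reach the wiki.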

...and you probably left a temp folder: /var/www/dokuwiki/data/pages/var/www/dokuwiki/data/pages/packages/index/

Edit: https://lede-project.org/packages/index/start wasn't there, so I re-created it from an old version.

Edit II: I see, you named the index page "index" -> should be "start" instead.
If you go to packages/index/, dokuwiki will search for a "start" page.

When you do a wiki search for "packages", this results in a list of all packages.
Search for "network" results in a list mostly consisting of packages.

I feel this spams the search function and makes it less usable.
We have the packages/start and packages/index/start to search for packages, therefore we do not need to have the packages in the wiki search.

To achieve this, please add "HIDEPAGE" as the first line in each package dataentry.
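
Something along these lines could add that first line to already generated files. It is a sketch only: the helper name `prepend_marker` is made up, and the marker text is taken literally from this post, so the exact syntax the wiki expects may need adjusting.

```shell
# Hypothetical helper: make MARKER the first line of FILE unless it already is.
prepend_marker() {
    marker=$1
    file=$2
    if [ "$(head -n 1 "$file")" != "$marker" ]; then
        tmp=$(mktemp) || return 1
        { printf '%s\n' "$marker"; cat "$file"; } > "$tmp"
        # Write back into the original file so its owner and mode stay unchanged
        cat "$tmp" > "$file"
        rm -f "$tmp"
    fi
}
```

It could be called in a loop over the generated files, e.g. `for f in ./pkgdata/*.txt; do prepend_marker 'HIDEPAGE' "$f"; done` (the ./pkgdata work folder is an assumption). Running it twice does not add the line twice.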

Another observation: "cols" should be set to max. 4, since everything above shows bad results (content of the last columns disappears) when windowsize is <1200px.

EDIT: Example -> https://lede-project.org/packages/index/network---telephony

Apart from the above remarks: Thank you Alberto for this already brilliant work! :slight_smile:

done another upload, this time everything seems to be OK from the start (the datacloud is taking ages to update itself, as usual).

I found a way to update stuff from CLI without leaving ghosts in the database and without calling a re-index of everything. The following is the code that deletes all package entries (similar code deletes the index entries).

# Run from a scratch directory: the working copies are created and removed in the current directory.
for file_name in $( sudo ls /var/www/dokuwiki/data/pages/packages/pkgdata/ ) ; do
    # wiki page ID: namespace plus the file name without its .txt suffix
    page="packages:pkgdata:${file_name%.txt}"
    echo "deleting package $file_name"
    sudo /var/www/dokuwiki/bin/dwpage.php -u www-data checkout "$page"
    # empty the working copy; committing the empty file makes the wiki delete the page
    rm "$file_name"
    touch "$file_name"
    sudo /var/www/dokuwiki/bin/dwpage.php -u www-data commit -m "deleted" "$file_name" "$page"
    rm "$file_name"
done

It uses a dokuwiki CLI interface (dwpage.php) and basically takes each article telling the wiki that it is "open for reading", then erases its contents, then tells the wiki "file modified" so the wiki will actually go and delete it from both data folder and from the database.

It's kinda slow and CPU-intensive though, but I can't do much on that, it's already doing one file at a time to avoid murdering the server, and it needs around 3-4 seconds per file.

After that has run, it copies over the new .txt files and calls a reindex (not a full reindex) to detect them.

Now, I would like to set up the actual indexing script on the server, to run on a schedule and do the thankless job of keeping package lists updated.

Can I set it as a cron job? Do I leave it and its work folders in my home folder on the server?

EDIT: I'm also adding a filter to remove the dependency on "libc" that is pointless for the package table.

You forgot to clean up the ownership after your script has run.
If owner = root, strange things can happen (I encountered two error messages when opening a package dataentry).

Owner + group of all files below /dokuwiki should be www-data.
chown -R is your friend.

As for your process of updating the dataentries: It is really way faster to just replace the dataentry pages, run the indexer, then remove any debris in the database.

number of package dataentries: 4000

Your method: 0,25/sec (4sec/dataentry) -> 16000sec -> roughly 5h

Conventional method:
copying dataentries: 100/sec -> 40sec
indexing: 4/sec -> 1000sec
database cleanup: 100/sec -> 40sec

Total: 1080sec -> 0,3h
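
The estimate above can be reproduced with shell integer arithmetic. This is a small sketch that uses only the figures quoted in this post (4000 dataentries, 4 s per dataentry for the per-page method, 100/s copying, 4/s indexing, 100/s cleanup); the variable names are made up for illustration.

```shell
entries=4000

# Per-page method: about 4 seconds per dataentry
per_page_secs=$(( entries * 4 ))

# Bulk method: copy at 100/s, index at 4/s, clean up the database at 100/s
bulk_secs=$(( entries / 100 + entries / 4 + entries / 100 ))

echo "per-page: ${per_page_secs}s"
echo "bulk: ${bulk_secs}s"
```

Changing `entries` shows how both totals scale as the number of packages grows.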

But I also see your motivation: Cleaning up any debris in the database via script, not via the dokuwiki admin area.

Hmmm... tricky.
I'll do some further research on how to clean up the DB via script. Maybe we can get this to work somehow. 0,3h vs. 5h, that's too tempting to ignore.

That's where we need advice from @thess and @jow. I'm in the same situation (script for updating https://lede-project.org/toh/views/fwfiles_vs_dataentries_searchtable needs to be run regularly).

cron would also be my choice, but where to place the scripts? Any rules for adding cronjobs, e.g. add comments regarding purpose/maintainer of the script?

You forgot to clean up the ownership after your script has run.

Actually, I did that. I have a chown in the package-loading script (in my home, see "loadpackage-gz" script)
Hmm, maybe I need to run the chown after I ran the re-indexing as root?

I had similar issues with dwpage.php, but that tool allows specifying a user to run as. The indexer does not; it likely changes permissions on files it indexes to root or something.

16000sec -> roughly 5h

More like 2.5 - 3 hours of real time actually. It's faster to run a full reindex than doing it like this.

Exactly, chown after indexer.php. And not only chown on the actual data (dokuwiki/data/pages/packages/), but also chown

/dokuwiki/data/meta
/dokuwiki/data/index

Do a find . -user root before and after your script to see which folders need to be chown'ed.
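
A minimal sketch of that check and cleanup, assuming the wiki lives in /var/www/dokuwiki as elsewhere in this thread. The helper name `list_owned_by` and the `DOKU_DATA` variable are made up for illustration.

```shell
# Hypothetical helper: print everything under the given paths owned by USER.
list_owned_by() {
    owner=$1
    shift
    find "$@" -user "$owner" -print
}

# Assumed install location; override DOKU_DATA if the wiki lives elsewhere
DOKU_DATA=${DOKU_DATA:-/var/www/dokuwiki/data}

if [ -d "$DOKU_DATA" ]; then
    # Show what the script left behind as root
    list_owned_by root "$DOKU_DATA/pages" "$DOKU_DATA/meta" "$DOKU_DATA/index"
    # Hand those folders back to the web server user
    sudo chown -R www-data:www-data "$DOKU_DATA/pages" "$DOKU_DATA/meta" "$DOKU_DATA/index"
fi
```

Running the listing once before and once after the update script shows whether any other folder needs the same treatment.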

Then I'd say: If it's faster, go for a full re-indexing, although I don't like to put unnecessary load on the server due to reindexing lots of pages that have not really been modified (~1200 dataentries for supported devices, number still growing). Would be good if the script then runs at low traffic times (approx. 2300-0100 server time -> https://lede-project.org/stats/#hours)

I forgot: You can also try to run the indexer with sudo -u www-data

Depending on the permissions I usually place cronjob scripts in either /root or my personal home directory. As long as the name is self-explanatory I do not think that you need any extra comments.

Quick update: I've been running some tests on the server lately.

The script is running in full-ninja mode (CPU priority and disk access priority lower than any other process on the server)

/usr/bin/renice -n 20 $$
/usr/bin/ionice -c2 -n7 -p$$ 

And I confirmed this by simulating real load on the server by simply refreshing 5 different pages of the table of packages in my web browser (instant CPU at 100% from htop). My script's execution is paused until the server has finished its main job.

I ran a full creation of the package txt files and it took around a full day to finish (I was thinking it would take much more than this). Since there is no impact on server performance, it should be fine.

Will run some more tests where it also updates the packages in the wiki, then I can add it as a cron job.

Good, all tests went fine. I set up the script to print errors and/or logs in a wiki page accessible from maintainer view of ToP https://lede-project.org/packages/logfile

It currently prints errors about packages that were dropped in the source, but are still there in some archs/package_archs as the build bots never updated them (I think I already said so in the mailing list some time ago). This is a buildbot issue that does not affect the ToP much, but you can see what errors will look like.

This weekend will do some minor editing to the logging logic and then finally add the script as a cron job.

Nice idea, to renice/ionice the ToP creation, and also to put the logfile in the wiki.
Thanks for your excellent work so far!

@bobafetthotmail The recent changes page shows all packages as "deleted", but the package pages are in fact not deleted; they only got changed.

I don't think this is intended behaviour. Can you please look into this?

Hm, that's because I'm using dwpage.php to delete them (and add the commit message "deleted" as it whines if I don't), but to add them I'm running the indexer.php that is much faster than dwpage.php. It seems it did not set a commit message (and I have no way to ask it to make one).

If you look at a random page that was deleted https://lede-project.org/packages/index/network---olsr.org-network-framework?do=revisions , you see that it has a "(external edit)" on it, but that's not a commit, I don't know why.

I think that the older "external edit" commit there was done when I was running the indexer as root then chowning to www-data all website folders afterwards.

I can try that again, or switch to using dwpage.php for loading packages too but I'd rather avoid that as it's slow.

EDIT: it might make more sense to not have the package data file deletions logged in the wiki, as it's pretty large logspam every Sunday. Will also see if I can get that to work.