Fedora summer-coding Planet

Week eleven: Summer of coding report

Posted by squimrel on August 16, 2017 12:33 AM

Persistent storage works with UEFI

Most of the work I’m doing has to do with byte fiddling since I want to develop cross-platform solutions in areas where no cross-platform libraries are available.

It might be better to develop a library for things like this but the use-cases I’m working on are so tiny that for me it’s just not worth it to write a library for them.

A quick overview of things I needed to do manually because I didn’t find a “cross-platform” C or C++ library that does them. Keep in mind that I might have done crappy research.

  • Modify a file on an ISO 9660 file system.
  • Add a partition to an MBR partition table.
  • Create a FAT file system.
  • Modify a file on a FAT file system.

It’s pretty sad that after so many years no one has found the time to write libraries for these simple tasks. Maybe it’s just not the most fun thing to do.

I solved all of these using byte fiddling. The exception is ISO 9660, for which I started to write a library that could be extended in the future to do more than modify a file. It should be noted that ISO 9660 is a read-only file system and therefore not meant to be modified in the first place.

efiboot.img byte fiddling

Making persistent storage work on UEFI is just a matter of adding a couple of switches to the grub.cfg that lives inside a FAT file system, which in turn is stored inside the ISO 9660 file system as a file called efiboot.img.

UEFI uses this file because there’s an EFI partition that starts at the same block as the efiboot.img file content.

I had to figure out the position of the grub.cfg inside the efiboot.img, add the corresponding switches and tell the FAT file system the new file size of the grub.cfg file.
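To illustrate the kind of byte fiddling involved, here’s a rough sketch in Python (the actual MediaWriter code is C++ and walks the FAT structures properly; this shortcut assumes grub.cfg keeps its 8.3 short name and that the new content has already been written in place):

    import struct

    def patch_grub_cfg_size(fat_image_path, new_size):
        """Update the size recorded in grub.cfg's FAT directory entry."""
        with open(fat_image_path, "r+b") as img:
            data = img.read()
            # A FAT directory entry is 32 bytes: an 11-byte short name
            # ("GRUB    CFG"), attribute and timestamp fields, and the file
            # size as a little-endian 32-bit integer at offset 28.
            entry = data.find(b"GRUB    CFG")
            if entry == -1:
                raise RuntimeError("grub.cfg directory entry not found")
            img.seek(entry + 28)
            img.write(struct.pack("<I", new_size))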

Mac

Booting into the live system with persistent storage enabled on a Mac does not work yet, because for that I’d have to modify a file on an HFS+ file system, which I haven’t done yet, and, as always, there’s no cross-platform library for this.

What’s next

Since this is the last week before final evaluations I’ll spend the rest of the time debugging the application on Windows and therefore I’ll not add the missing switch to the HFS+ file system this summer.

Building for Windows works but it crashes for reasons I don’t yet know. I don’t have much experience debugging on Windows so this will be fun -.-. I’ve got gdb peda ❤ running on my Windows VM so I’ll most likely be just fine.

Performing a complete DB dump in LDIF format (1/3)

Posted by Ilias Stamatis on August 12, 2017 12:34 PM

Let’s dive a little bit deeper into the Directory Server’s internals this time. As I mentioned in my previous post, I have lately started working on ticket #47567, and here I’m going to explain what it is about. I’ll split this post into 3 parts for easier reading. I’ll first explain some key concepts about how data is actually stored in Directory Server, and then I’ll talk about what we want to achieve.

Database backend & Berkeley DB

The database backend of Directory Server is implemented as a layer above the Berkeley DB storage manager (BDB). Berkeley DB takes care of lower level functions such as maintaining the B-Trees, transaction logging, page pool management, recovery and page-level locking. The server backend on the other hand handles higher level functions such as indexing, query optimization, query execution, bulk load, archive and restore, entry caching and record-level locking. Berkeley DB is not a relational database. It stores data in a key/value format.

DB files & dbscan

The BDB files used by DS can be found under /var/lib/dirsrv/slapd-instance/db/backend.

For example here are the database files of the userRoot backend for a DS instance named localhost:

[root@fedorapc db]# ls -l /opt/dirsrv/var/lib/dirsrv/slapd-localhost/db/userRoot
total 276
-rw-------. 1 dirsrv dirsrv 16384 Jul 31 00:47 aci.db
-rw-------. 1 dirsrv dirsrv 16384 Aug 4 17:01 ancestorid.db
-rw-------. 1 dirsrv dirsrv 16384 Jul 31 00:47 cn.db
-rw-------. 1 dirsrv dirsrv 51 Aug 10 14:03 DBVERSION
-rw-------. 1 dirsrv dirsrv 16384 Aug 4 17:01 entryrdn.db
-rw-------. 1 dirsrv dirsrv 16384 Aug 4 17:01 entryusn.db
-rw-------. 1 dirsrv dirsrv 16384 Aug 4 17:01 id2entry.db
-rw-------. 1 dirsrv dirsrv 16384 Aug 2 16:22 member.db
-rw-------. 1 dirsrv dirsrv 16384 Jul 31 00:47 memberOf.db
-rw-------. 1 dirsrv dirsrv 16384 Jul 31 00:47 nsuniqueid.db
-rw-------. 1 dirsrv dirsrv 16384 Aug 4 17:01 numsubordinates.db
-rw-------. 1 dirsrv dirsrv 16384 Aug 2 15:48 objectclass.db
-rw-------. 1 dirsrv dirsrv 16384 Aug 4 17:01 owner.db
-rw-------. 1 dirsrv dirsrv 16384 Aug 4 17:01 parentid.db
-rw-------. 1 dirsrv dirsrv 16384 Aug 4 17:01 seeAlso.db
-rw-------. 1 dirsrv dirsrv 16384 Jul 31 00:47 sn.db
-rw-------. 1 dirsrv dirsrv 16384 Aug 4 17:01 uid.db
-rw-------. 1 dirsrv dirsrv 16384 Aug 4 17:01 uniquemember.db

At this point, let’s introduce dbscan. dbscan is a command line tool (provided by DS) that scans a database file and dumps its contents.

DB files: index files

Now, most of the above listed files are index files, used to speed up DS searches. For example sn.db is the index file used for the “sn” attribute (surname). We can see its contents by using dbscan:

[root@fedorapc userRoot]# dbscan -f sn.db 
*^st 
*^su 
*ama 
*ame 
*ati 
*is$ 
*mat 
*me$ 
*nam 
*rna 
*sta 
*sur 
*tam 
*tis 
*urn 
+ 
=stamatis 
=surname

Its output is not that lengthy because only 2 sn attributes exist in the database so far. In a similar way, cn.db is the index file for the cn attribute (common name) etc.

DB files: id2entry.db

Let’s now see a very important file: id2entry.db. This is where the actual data is stored. Every single directory entry is stored as a key–value record in this file, with the entry’s ID used as the key and the actual entry data as the value.

One useful feature of dbscan is that it accepts many options. If we do “dbscan -f id2entry.db”, dbscan will list all directory entries. Instead, we can just display a single entry if we wish by using the -K option.

So, to print entry with ID 3:

[root@fedorapc userRoot]# dbscan -K 3 -f id2entry.db
id 3
    rdn: ou=Groups
    objectClass: top
    objectClass: organizationalunit
    ou: Groups
    nsUniqueId: 88c7da02-6b3f11e7-a286de5a-31abe958
    creatorsName:
    modifiersName:
    createTimestamp: 20170717223028Z
    modifyTimestamp: 20170717223028Z
    parentid: 1
    entryid: 3
    numSubordinates: 5

Important things to notice:

  • Operational attributes are displayed as well.
  • The DN of an entry is not stored in the database, only its RDN.
  • Every entry (except for the root entry) has a parentid attribute which links to its parent (see the sketch after this list).
  • There’s a numSubordinates attribute, indicating how many children this entry has. If this attribute is absent, it means that the entry has no children.
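To see why the last two observations matter, here’s a tiny sketch (not DS code; the ids and the root suffix below are made up) of how a full DN can be rebuilt from the stored RDNs and parentid values:

    # Hypothetical id -> (rdn, parentid) map, as if read from id2entry.db
    entries = {
        1: {"rdn": "dc=example,dc=com", "parentid": None},
        3: {"rdn": "ou=Groups",         "parentid": 1},
        7: {"rdn": "cn=HR Managers",    "parentid": 3},
    }

    def build_dn(entry_id):
        """Walk the parentid chain and join the RDNs into a full DN."""
        parts = []
        while entry_id is not None:
            parts.append(entries[entry_id]["rdn"])
            entry_id = entries[entry_id]["parentid"]
        return ",".join(parts)

    print(build_dn(7))   # cn=HR Managers,ou=Groups,dc=example,dc=com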

We will see why these observations matter later. For now, let’s continue with another database file.

DB files: entryrdn.db

Lastly, let’s take a look at entryrdn.db. I’ll not list the whole output, but just a short part instead, in order to understand how we can read this file.

...
3
 ID: 3; RDN: "ou=Groups"; NRDN: "ou=groups"
C3
 ID: 6; RDN: "cn=Accounting Managers"; NRDN: "cn=accounting managers"
6
 ID: 6; RDN: "cn=Accounting Managers"; NRDN: "cn=accounting managers"
P6
 ID: 3; RDN: "ou=Groups"; NRDN: "ou=groups"
C3
 ID: 7; RDN: "cn=HR Managers"; NRDN: "cn=hr managers"
7
 ID: 7; RDN: "cn=HR Managers"; NRDN: "cn=hr managers"
...

Every line displays the ID, the RDN and the normalized RDN (all lowercase, no extra spaces etc.) of an entry. The keys carry some more information. The 3 in the first line means that the following entry is the one with id 3, C3 means that the following entry is a child of the entry with id 3, and P6 means that the following entry is the parent of the entry with id 6. In this way we can interpret the whole file.
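As an illustration, here’s a small sketch that interprets a textual dump like the one above, assuming we pipe the dbscan output into it (the real tooling of course reads the BDB file directly):

    import re
    import sys

    LINE = re.compile(r'ID: (\d+); RDN: "(.*)"; NRDN: "(.*)"')

    def describe(key):
        if key.startswith("C"):
            return "child of entry " + key[1:]
        if key.startswith("P"):
            return "parent of entry " + key[1:]
        return "entry " + key + " itself"

    key = None
    for line in sys.stdin:
        if not line.startswith(" "):      # key line: "3", "C3", "P6", ...
            key = line.strip()
            continue
        match = LINE.search(line)
        if match and key:
            eid, rdn, nrdn = match.groups()
            print("%-4s -> %-20s id=%s rdn=%s" % (key, describe(key), eid, rdn))

Something like dbscan -f entryrdn.db | python3 parse_entryrdn.py (the script name is made up) would then print one interpreted line per record.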

In the second part of this post we will discuss what we really want to achieve here: performing a complete database dump in valid LDIF format.

 


Week ten: Summer of coding report

Posted by squimrel on August 08, 2017 11:13 PM

The picture below includes all the UI changes that were made:

[Screenshot: the Fedora Media Writer UI changes]

The portable media device which is made bootable using the MediaWriter does not support persistent storage if UEFI is enabled on the target computer or if the target computer is a Mac.

To enable support for those systems the overlay switch has to be added to a couple more files, which live inside .img files.

To be specific, the efiboot.img is a FAT file system and the macboot.img is an HFS+ file system. On those two file systems the boot.cfg files need to be modified in place, and to do that the corresponding file system needs to be told the new size of the file.

Let’s see how far we’ll get until next week.

Another GSoC Update

Posted by Ilias Stamatis on August 08, 2017 09:00 AM

A small update on what I have been doing lately on the project.

Here are some tickets that I have worked on in the C code base:

  • #48185 referint-logchanges is not implemented in referint plugin
  • #49309 Referential Integrity does not perform proper syntax checking on referint-update-delay
  • #49329 Provide descriptive error message when neither suffix nor backend is specified for USN cleanup task
  • #49315 Add warning on startup if unauthenticated binds are enabled.

And here are a few on the lib389 side:

  • #46 dsconf support for schema reload plugin — This one allows us to dynamically reload the schema while the server is running.
  • #45 dsconf support for rootdn access — With this plugin we enforce access control on the Directory Manager.

Since last week I have been working on another issue: https://pagure.io/389-ds-base/issue/47567

This last one is by far the most interesting ticket I have been working on this summer, since it has helped me to understand a lot of the server internals. It’s also a tricky one, so it will take some time. It has also led to some interesting discussions on the mailing list. In my next blog post I’ll write more details on this and explain what it is about.

Happy coding!


GSoC: Report of bug fixes

Posted by David Carlos on August 05, 2017 05:00 PM

This post is just a report of some bug fixes done on the last released version of kiskadee. Version 0.2.3 is the last release before the development of our API, and we decided to release it now because some of the issues that we have fixed could disrupt the API development. The list of issues that we have fixed is:

  • #18 : Use the download method, inside kiskadee.util, to download stuff from the Internet.
  • #25 : sqlalchemy crash when more than two plugins are active.
  • #31 : In some analyses, the flawfinder parser gets into an infinite loop.
  • #32 : The temporary directory created by Docker is not being removed.
  • #33 : Rename the plugin package to fetcher.
  • #35 : Execute runner and monitor as separate processes.
  • #37 : Anitya fails to transform a fedmsg event into a python dictionary.
  • #38 : The Docker SDK for Python cuts off the analysis results when the result is too long.

Of these issues, the ones that matter most are #35 and #38. With the implementation of issue #35, the monitor and the runner components now run in separate processes, allowing a better use of the OS resources. Being a separate process, the runner component can now run each analyzer concurrently instead of sequentially, which will increase the speed with which kiskadee runs the analyses. Issue #38 was a bug that we found in the Docker SDK for Python. When the output of a static analyzer was too long, the Docker SDK was cutting off the analysis result, and we were saving an incomplete analysis in the database. This was causing bug #31, because the flawfinder parser was not able to parse an incomplete analysis.

Now we will start working on the kiskadee API, which should be released in version 0.3.0. With the API, we will be able to make all the analyses done by kiskadee available in a standard way, allowing other tools to interact with our database.

Week nine: Summer of coding report

Posted by squimrel on August 02, 2017 01:31 AM

Overlay partition works. GUI stuff pending. EFI coming up.

The PR I’m working on can now create a partition with a FAT32 file system and an overlay file on POSIX-compliant systems.

The FAT32 file system is well known for its 4GB maximum file size. Therefore any device whose size is greater than (iso image size + 4GB) will still only get a 4GB overlay file, and the rest of the additional FAT32 file system may be used by the end user.
Note that some systems, like Mac, will not be able to mount that additional FAT32 partition, because the partition entry was only added to the MBR partition table, while the isohybrid layout also deploys a GPT partition table and an Apple partition table. Systems that prefer one of those tables therefore can’t see the new partition.
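The sizing rule boils down to simple arithmetic; here’s a sketch (names made up, the real logic lives in the C++ PR):

    FAT32_MAX_FILE = 4 * 1024**3 - 1   # FAT32 caps a single file just below 4 GiB

    def overlay_file_size(device_size, iso_size):
        """Return how large the persistent overlay file can be, in bytes."""
        free_space = device_size - iso_size
        return min(free_space, FAT32_MAX_FILE)

    print(overlay_file_size(16 * 1024**3, 2 * 1024**3))   # capped at ~4 GiB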

Tomorrow I’ll make the UI display that the overlay partition is being written. There have also been thoughts on letting the user skip the write check, but I’ll probably come back to that once EFI boot works.

More information regarding EFI will follow next week.

GSoC2017 (Fedora) — Week 5-8

Posted by Mandy Wang on July 27, 2017 04:31 PM

I continued the work on migrating Plinth. I summarized the packages which are needed by Plinth and their alternatives in Fedora.

Sometimes there are small differences in the details between packages that have similar names in Fedora and Debian, so I’ll check them one by one and find better solutions.

As for libjs-bootstrap and libjs-modernizr, I couldn’t find suitable alternatives, so I extracted the .deb packages on Fedora and put the files under javascript/.

As you know, most foreign websites are blocked in mainland China, including all of the Google services, many frequently-used IM applications, Wikipedia and so on, so we have to use a VPN or other means to connect to servers located abroad to reach these websites. But lots of VPNs in China (including mine) were unexpectedly blocked in July for political reasons, and I couldn’t update my blog and code until I found a new way to get "over the wall", so my work was delayed this month.

And because my mentor also lives in China and refuses to use the non-free software that is common there, we normally communicate via Gmail and Telegram, so I lost contact with him for many days. As soon as we were able to reach each other again in the last few days, I told him about the work I did this month and the questions I ran into. Given this situation, we are thinking about temporarily scaling down our project and delaying some less important work; I will put forward concrete plans together with my mentor as soon as possible.


Week eight: Summer coding report

Posted by squimrel on July 25, 2017 06:08 AM

Format a partition with the FAT32 filesystem

Modern file systems are not simple at all. FAT32 was introduced in 1996 and FAT is much older than that, so it’s much simpler, but it’s still not very intuitive or easy to use. I couldn’t find a library that provides the functionality to format partitions, which is weird. On Linux we usually use the mkfs.fat utility, which is part of dosfstools, but dosfstools is not laid out to be used as a library.

There’s obviously the specification, which I could implement, but I only need it for one specific use case, so that seemed like overkill. The layout needed is basically always the same and looks like this when generated by mkfs.fat (ignoring empty space):

00000000: eb58 906d 6b66 732e 6661 7400 0208 2000  .X.mkfs.fat... .
00000010: 0200 0000 00f8 0000 3e00 f700 0098 2e00 ........>.......
00000020: 0028 c000 f82f 0000 0000 0000 0200 0000 .(.../..........
00000030: 0100 0600 0000 0000 0000 0000 0000 0000 ................
00000040: 8000 29fe caaf de4f 5645 524c 4159 2020 ..)....OVERLAY
00000050: 2020 4641 5433 3220 2020 0e1f be77 7cac FAT32 ...w|.
00000060: 22c0 740b 56b4 0ebb 0700 cd10 5eeb f032 ".t.V.......^..2
00000070: e4cd 16cd 19eb fe54 6869 7320 6973 206e .......This is n
00000080: 6f74 2061 2062 6f6f 7461 626c 6520 6469 ot a bootable di
00000090: 736b 2e20 2050 6c65 6173 6520 696e 7365 sk. Please inse
000000a0: 7274 2061 2062 6f6f 7461 626c 6520 666c rt a bootable fl
000000b0: 6f70 7079 2061 6e64 0d0a 7072 6573 7320 oppy and..press
000000c0: 616e 7920 6b65 7920 746f 2074 7279 2061 any key to try a
000000d0: 6761 696e 202e 2e2e 200d 0a00 0000 0000 gain ... .......
000001f0: 0000 0000 0000 0000 0000 0000 0000 55aa ..............U.
00000200: 5252 6141 0000 0000 0000 0000 0000 0000 RRaA............
000003e0: 0000 0000 7272 4161 fdf8 1700 0200 0000 ....rrAa........
000003f0: 0000 0000 0000 0000 0000 0000 0000 55aa ..............U.
00000c00: eb58 906d 6b66 732e 6661 7400 0208 2000 .X.mkfs.fat... .
00000c10: 0200 0000 00f8 0000 3e00 f700 0098 2e00 ........>.......
00000c20: 0028 c000 f82f 0000 0000 0000 0200 0000 .(.../..........
00000c30: 0100 0600 0000 0000 0000 0000 0000 0000 ................
00000c40: 8000 29fe caaf de4f 5645 524c 4159 2020 ..)....OVERLAY
00000c50: 2020 4641 5433 3220 2020 0e1f be77 7cac FAT32 ...w|.
00000c60: 22c0 740b 56b4 0ebb 0700 cd10 5eeb f032 ".t.V.......^..2
00000c70: e4cd 16cd 19eb fe54 6869 7320 6973 206e .......This is n
00000c80: 6f74 2061 2062 6f6f 7461 626c 6520 6469 ot a bootable di
00000c90: 736b 2e20 2050 6c65 6173 6520 696e 7365 sk. Please inse
00000ca0: 7274 2061 2062 6f6f 7461 626c 6520 666c rt a bootable fl
00000cb0: 6f70 7079 2061 6e64 0d0a 7072 6573 7320 oppy and..press
00000cc0: 616e 7920 6b65 7920 746f 2074 7279 2061 any key to try a
00000cd0: 6761 696e 202e 2e2e 200d 0a00 0000 0000 gain ... .......
00000df0: 0000 0000 0000 0000 0000 0000 0000 55aa ..............U.
00004000: f8ff ff0f ffff ff0f f8ff ff0f 0000 0000 ................
00603000: f8ff ff0f ffff ff0f f8ff ff0f 0000 0000 ................
00c02000: 4f56 4552 4c41 5920 2020 2008 0000 0666 OVERLAY ....f
00c02010: f64a f64a 0000 0666 f64a 0000 0000 0000 .J.J...f.J......

I could go more in depth but if you’re interested you can always read the specification or the source code I wrote.

Some values vary based on the size of the partition, like the number of sectors per cluster and the size of the FAT data structure. It might also be smart to set a unique volume id.

Therefore I used the layout generated by mkfs.fat and only made minimal changes to it. The result is part of this PR.
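For illustration, here’s a hedged sketch of that template approach in Python (the actual change is C++ and patches more of the size-dependent fields, such as sectors per cluster and the FAT size): it writes the total sector count and a fresh volume id into both the primary boot sector and the backup copy at sector 6, using the standard FAT32 boot sector offsets.

    import os
    import struct

    SECTOR = 512
    TOTAL_SECTORS_OFFSET = 32   # BPB_TotSec32: 32-bit total sector count
    VOLUME_ID_OFFSET = 67       # BS_VolID: 32-bit volume serial number

    def patch_fat32_template(template, partition_sectors):
        """Patch size-dependent fields of an mkfs.fat-generated FAT32 layout."""
        data = bytearray(template)
        volume_id = struct.unpack("<I", os.urandom(4))[0]
        for base in (0, 6 * SECTOR):    # primary boot sector and its backup copy
            struct.pack_into("<I", data, base + TOTAL_SECTORS_OFFSET, partition_sectors)
            struct.pack_into("<I", data, base + VOLUME_ID_OFFSET, volume_id)
        return bytes(data)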

GSoC: making static analysis with kiskadee

Posted by David Carlos on July 25, 2017 12:00 AM

This post will cover some static analyses made with kiskadee 0.2.2 [1], and our plans for the next major release. We will present two packages as examples, demonstrating the use of the available plugins and analyzers. Currently, kiskadee uses four static analyzers:

  • Cppcheck, version 1.79 [2]
  • Flawfinder, version 1.31 [3]
  • Clang-analyzer, version 3.9.1 [4]
  • Frama-c, version 1.fc25 [5]

In the production environment two plugins are running: the debian plugin and the anitya plugin. The anitya plugin is our main plugin, because with it we can monitor and analyze several upstream projects that are packaged in Fedora. We have already talked in depth about these two plugins here on the blog. The projects that we will show here were analyzed by the cppcheck and flawfinder tools: the Xe [6] project, monitored by the anitya plugin, and the acpitool [7] project, monitored by the debian plugin.

The Xe project was initially monitored by the Anitya service. The upstream released a new version, and this event was published on the fedmsg bus by the Anitya service. Figure One shows the new release of the Xe project, and Figure Two shows the event published on fedmsg.

Figure One: New Xe release.

Figure Two: The new release event.

The anitya plugin behavior is presented in Figure Four. Every time the Anitya service publishes a new release event, the fedmsg-hub daemon will receive it and send it to the anitya plugin. If the new release is hosted in a place where the anitya plugin can retrieve its source code, an analysis will be made.

Figure Four: anitya plugin behavior.

Figure Five shows a static analysis made by the flawfinder analyzer on the Xe source code. This analysis was only possible because the anitya plugin can receive new release events published on fedmsg; in this post we talked about how this integration was made. Figure Six shows a static analysis made by the cppcheck analyzer, also on the Xe source code.

Figure Five: Flawfinder analysis.

Figure Six: Cppcheck analysis.

The second project that we analyzed was the acpitool package, which was monitored by the debian plugin. The source code of this package was retrieved using the dget tool, available in the devscripts package. Figure Seven presents part of the analysis made by cppcheck.

Figure Seven: Cppcheck analysis.

All the analyses presented here can be found in two backups of the kiskadee database, available here on the blog at the following links:

You can download these backups, import them into a PostgreSQL database, and check several other analyses that we have already made. Note that these backups are from before the architecture change that we talked about in the post Improvements in kiskadee architecture.

The next major release of kiskadee will bring something that we believe will permit us to integrate kiskadee with other tools. We will start the development of an API that will provide several endpoints to consume our database. This API is a new step toward one of our objectives, which is to facilitate the process of integrating static analyzers into the software development cycle.

[1]https://pagure.io/kiskadee/releases
[2]http://cppcheck.sourceforge.net/
[3]https://www.dwheeler.com/flawfinder/
[4]https://clang-analyzer.llvm.org/
[5]https://frama-c.com/
[6]https://github.com/chneukirchen/xe
[7]https://sourceforge.net/projects/acpitool/

GSoC: Improvements in kiskadee architecture

Posted by David Carlos on July 24, 2017 05:00 PM

Today I have released kiskadee 0.2.2. This minor release brings some architecture improvements, fixes some bugs in the plugins and improves the log message format. First, let's take a look at the kiskadee architecture implemented in the 0.2 release.

In this architecture we have two queues. One, called packages_queue, is used by plugins to enqueue packages that should be analyzed. This queue is consumed by the monitor component, which checks whether the enqueued package has already been analyzed. The other queue, called analysis_queue, is consumed by the runner component in order to receive from the monitor the packages that must be analyzed. If a dequeued package does not exist in the database, the monitor component will save it and enqueue it in the analysis_queue. When an analysis is made, the runner component updates the package analysis in the database. Currently, kiskadee only generates analyses for projects implemented in C/C++; this was a scope decision made by the kiskadee community. Analyzing only projects implemented in these languages means that several monitored packages are never analyzed, and with the 0.2 architecture this behavior led us to a serious problem: a package was saved in the database even if no analysis was generated for it. This left our database storing several packages without a static analysis of their source code, making kiskadee a less useful tool for the ones that want to continuously check the quality of some projects.

Release 0.2.2 fixes this architecture issue by creating a new queue, used by the runner component to send back to the monitor the packages that were successfully analyzed. In this implementation, we removed all database operations from the runner source code, centering in the monitor the responsibility of interacting with the database. Only packages enqueued in the results_queue will be saved in the database by the monitor component.

We also added a limit to all kiskadee queues, since the rate at which a plugin enqueues packages is greater than the rate at which the runner runs analyses. With this limit, all queues will always have at most ten elements, which makes the volume of monitored projects proportional to the analyzed ones. The log messages were also improved, making the tool easier to debug. Some bug fixes were also made in the Debian plugin, and now some packages that were being missed are properly monitored. These architecture improvements make the behavior of kiskadee more stable, and this release is already running in a production environment.
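Here is a minimal sketch of the new hand-off, assuming bounded multiprocessing queues and a placeholder run_analyzers helper (the real component names and wiring live in the kiskadee code base):

    import multiprocessing as mp

    analysis_queue = mp.Queue(maxsize=10)   # monitor -> runner
    results_queue = mp.Queue(maxsize=10)    # runner  -> monitor

    def run_analyzers(package):
        # placeholder for the real analyzers (cppcheck, flawfinder, ...)
        return {"package": package, "report": "..."}

    def runner():
        while True:
            package = analysis_queue.get()
            # the runner no longer touches the database at all
            results_queue.put(run_analyzers(package))

    def monitor(save_to_db):
        while True:
            analysis = results_queue.get()
            save_to_db(analysis)            # the monitor owns every database write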

Week seven: Summer coding report

Posted by squimrel on July 18, 2017 08:15 PM

The plan to get rid of all issues

Windows

The Virtual Disk Service can’t be used since the Fedora Packaging Guidelines prohibit me from pulling in Uuid.lib.

Mac

I had a look at how to do this on macOS. The technical documentation in this area is even worse than on Windows, in my humble opinion.

The tool that works with partitions prefers to use the apple partition header if available. But I’d like to add a partition to the master boot record (mbr) partition table.

isohybrid

Quick refresher on why all the tools have trouble with the isohybrid layout.

For compatibility, isohybrid fills three partition tables (MBR, GPT and Apple), but I’d only like to add a primary partition to the MBR one. Basically it’s unclear which table should be used, since it’s not standard practice to use multiple partition tables.

The plan

Since there’s trouble with this on all three platforms we target, I’ve decided to manually add the partition to the MBR partition table.

This does work on Linux now but was a lot of trouble since it doesn’t integrate well with udisks and more debugging needs to be done to fix that.
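Appending an entry to an MBR partition table amounts to writing 16 bytes into one of the four slots starting at offset 446. Here’s a hedged Python sketch of that idea (the actual MediaWriter code is C++ and the values below are illustrative):

    import struct

    MBR_TABLE_OFFSET = 446          # four 16-byte entries, then the 0x55AA signature
    ENTRY_SIZE = 16
    FAT32_LBA_TYPE = 0x0C

    def add_mbr_partition(disk_path, start_lba, num_sectors, part_type=FAT32_LBA_TYPE):
        """Write a partition entry into the first unused MBR slot."""
        with open(disk_path, "r+b") as disk:
            for slot in range(4):
                offset = MBR_TABLE_OFFSET + slot * ENTRY_SIZE
                disk.seek(offset)
                entry = disk.read(ENTRY_SIZE)
                if entry[4] == 0:               # a type byte of 0 marks an unused slot
                    disk.seek(offset)
                    # status, CHS start (0xFE 0xFF 0xFF = "use LBA"), type,
                    # CHS end, LBA of the first sector, number of sectors
                    disk.write(struct.pack("<B3sB3sII", 0x00, b"\xfe\xff\xff",
                                           part_type, b"\xfe\xff\xff",
                                           start_lba, num_sectors))
                    return slot
            raise RuntimeError("no free slot in the MBR partition table")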

It will definitely work on Windows and Mac next week if those two platforms don’t try to stop me. They always do.

I learned that writing things myself may be faster than trying to fix the world, and that whatever I do, I should try to make the right decision early on because that saves a ton of time.

GSoC Update: Referint and More

Posted by Ilias Stamatis on July 12, 2017 03:48 PM

Last week I started working on dsconf support for referential integrity plug-in. Referential Integrity is a database mechanism that ensures relationships between related entries are maintained. For example, if a user’s entry is removed from the directory and Referential Integrity is enabled, the server also removes the user from any groups of which the user is a member.

While working on referint support, I discovered and reported a bug, as well as a previous request which was about replacing the plugin’s log file with standard DS log files. So I decided to take on those issues and delay the lib389 support until they are done. It was also my first attempt to do some work in the C codebase.

It finally turned out that this logfile is not used for real server logging; it is actually part of how the plugin implements its asynchronous mode, so we couldn’t simply get rid of it. After discussing this a bit with William, my mentor, I started a discussion on the mailing list about changing the implementation from using a file to using a queue. William even suggested completely deprecating the asynchronous mode of referint. You can see the discussion here: https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org/thread/DB5YKUV4A2LVPPXP72OJ4KQC2H2B4G3W/?sort=date

Contrasting opinions were expressed in this debate and I’m waiting for a decision to be reached.

So, my goal at the moment is to try to do some C work on the main codebase and move out of the lib389 framework for a while. This is not that easy though. Directory Server is pretty big; cloc returns more than half a million lines of code. Plus, because I don’t have much experience writing C, I’m now learning how to properly use gdb, ASan and other tools to make my life easier, while getting familiar with some parts of the C codebase.

At the same time, I’m also working on smaller python issues on lib389 such as:
https://pagure.io/lib389/issue/74
https://pagure.io/lib389/issue/78
https://pagure.io/lib389/issue/43

So to conclude, at the moment I’m constantly jumping between issues, with my main goal being to start writing non-trivial C patches for the server.


Week six: Summer coding report

Posted by squimrel on July 10, 2017 01:44 PM

Where we’re at

The Linux build is stalled until libblockdev has switched from libparted to libfdisk.

Windows

Doing the partitioning work in Windows is way more complicated than it should be.

Ideally the Virtual Disk Service (VDS) should be used to add and format the partition, but the symbols that are needed to talk to the VDS COM interface are not present in MinGW, most likely because there’s no shared library for it. That’s why I decided not to use VDS at first. I’ll come back to this later.

Currently diskpart.exe is used as an alternative, since diskpart uses the VDS COM interface itself. But diskpart has problems of its own: it gets in the way of locking, the documentation says you need to wait 15 seconds after it quits, talking to it is slow, and reading its response is tedious.

There’s a newer API called Windows Storage Management API but it’s only available on Windows 8 and above so I haven’t looked at it.

There’s also an older set of tools that was introduced in Windows XP (in contrast, VDS was introduced in Windows Vista). It provides the DeviceIoControl function, with which you can do the partitioning work, and there’s the Volume Management API for mounting.
But Windows XP only has one function for formatting a partition, SHFormatDrive, and it comes with its own GUI and requires user interaction. Users on the internet argued back in the day that formatting a drive without user interaction would be evil.

On top of that it’s really hard to use the Windows XP API. It took me a long time to figure out how to use it to add a partition because certain things happen which you wouldn’t expect.

The reason why I started to use the Windows XP API in the first place is that adding a partition using diskpart.exe got in the way of disk locking. Up until now diskpart.exe was only used for restoring the drive, which is totally separate from the writing process.

Virtual Disk Service

As mentioned, linking against VDS is hard because one has to link against the static library, since there’s no dynamic one provided by Microsoft in this case. But I now have a working dummy program that loads VDS via COM.

To accomplish this I had to look at the vds.h provided by Microsoft and figure out how to write my own minimal one. That also seems to be MinGW’s approach to avoid the licensing issue. The reason why I had to look at vds.h at all is that there are some GUIDs which aren’t documented and which you just can’t guess.
Another evil part is that I needed the static library Uuid.Lib to link against. I’m not sure how exactly the licensing works in that case, but I guess as long as it’s not in the repository and just pulled from the internet to build the Windows target it should be fine.
I’m just not sure where to best pull it from yet. It also seems pretty evil (security-wise) to pull byte code from the internet and include it in another binary.

GSoC: The evolution of Kiskadee models.

Posted by David Carlos on July 08, 2017 01:50 PM

The 0.1 release of Kiskadee [1] brought a minimal set of functionalities that permitted us to monitor some targets and analyze some packages, serving more as a proof of concept than as a real monitoring system. One of the limitations of the 0.1 release was that Kiskadee saved all the analyses in a single field of the packages table. Every time we needed to generate a new analysis for the same package version, we appended the result to the analysis field. The next two code snippets were taken from the Kiskadee source code.

reports = []
...
with kiskadee.helpers.chdir(path):
    kiskadee.logger.debug('ANALYSIS: Unpacked!')
    analyzers = plugin.analyzers()
    for analyzer in analyzers:
        kiskadee.logger.debug('ANALYSIS: running %s ...' % analyzer)
        analysis = kiskadee.analyzers.run(analyzer, path)
        firehose_report = kiskadee.converter.to_firehose(analysis, analyzer)
        reports.append(str(firehose_report))
kiskadee.logger.debug('ANALYSIS: DONE running %s' % analyzer)
return reports

all_analyses = '\n'.join(reports)

Note that we generate an analysis for each plugin analyzer and append it to the reports array. In another part of the runner code we build, using a join, a single string with all the analyses made. This was a bad implementation of something that should be solved by changing the Kiskadee models to support several analyses for the same package version, and that is what we have done with PR #24. With this pull request, Kiskadee is able to save different analyses of the same package version made by different analyzers. A lot of refactoring was done in the Runner component, especially because it was difficult to implement new tests for it. The responsibility of saving an analysis was removed from the analyze method and moved to _save_source_analysis.

def _save_source_analysis(source_to_analysis, analysis, analyzer, session):
    if analysis is None:
        return None

    source_name = source_to_analysis['name']
    source_version = source_to_analysis['version']

    package = (
        session.query(kiskadee.model.Package)
        .filter(kiskadee.model.Package.name == source_name).first()
    )
    version_id = package.versions[-1].id
    _analysis = kiskadee.model.Analysis()
    try:
        _analyzer = session.query(kiskadee.model.Analyzer).\
            filter(kiskadee.model.Analyzer.name == analyzer).first()
        _analysis.analyzer_id = _analyzer.id
        _analysis.version_id = version_id
        _analysis.raw = analysis
        session.add(_analysis)
    except Exception as err:
        kiskadee.logger.debug(
            "The required analyzer was not registered on Kiskadee"
        )
        kiskadee.logger.debug(err)

With this pull request two new models were added to Kiskadee: analysis and analyzers. Diagram 1 presents the current database structure of Kiskadee.

Diagram 1: Kiskadee database structure.

With PR #24 we also fix some other issues:

  • Docker does not run properly on the Jenkins VM. Issue #23
    • Docker and SELinux are not good friends, so we had to change the way we were creating a container volume to run the analyzers.
  • CI integration. Issue #7
    • We now have a Jenkins instance running our tests for every push made to the pagure repository. In the future we want to add continuous deployment for every change merged into the master branch.
[1]https://pagure.io/kiskadee

ReFS Part III - Back to the Resilience

Posted by Mo Morsi on July 05, 2017 07:51 PM

We've made some great headway on the ReFS filesystem analysis front, to the point of being able to implement a rudimentary file extraction mechanism (complete with timestamps).

First a recap of the story so far:

  • ReFS, aka "The Resilient FileSystem" is a relatively new filesystem developed by Microsoft. First shipped in Windows Server 2012, it has since seen an increase in popularity and use, especially in enterprise and cloud environments.
  • Little is known about the ReFS internals outside of some sparse information provided by Microsoft. According to that, data is organized into pages of a fixed size, starting at a static position on the disk. The first round of analysis was to determine the boundaries of these top level organizational units to be able to scan the disk for high level structures.
  • Once top level structures, including the object table and root directory, were identified, each was analyzed in detail to determine potential parsable structures such as generic Attribute and Record entities as well as file and directory references.
  • The latest round of analysis consisted of diving into these entities in detail to try and deduce a mechanism with which to extract file metadata and content.

Before going into details, we should note this analysis is based on observations against ReFS disks generated locally, without extensive sequential cross-referencing and comparison of many large files with many changes. Also it is possible that some structures are oversimplified and/or not fully understood. That being said, this should provide a solid basis for additional analysis, getting us deep into the filesystem, and allowing us to poke and prod with isolated bits to identify their semantics.

Now onto the fun stuff!


- A ReFS filesystem can be identified with the following signature at the very start of the partition:

    00 00 00 52  65 46 53 00  00 00 00 00  00 00 00 00 ...ReFS.........
    46 53 52 53  XX XX XX XX  XX XX XX XX  XX XX XX XX FSRS

- The following Ruby code will tell you if a given offset in a given file contains a ReFS partition:

    # Point this to the file containing the disk image
    DISK="~/ReFS-disk.img"

    # Point this at the start of the partition containing the ReFS filesystem
    ADDRESS=0x500000

    # FileSystem Signature we are looking for
    FS_SIGNATURE  = [0x00, 0x00, 0x00, 0x52, 0x65, 0x46, 0x53, 0x00] # ...ReFS.

    img = File.open(File.expand_path(DISK), 'rb')
    img.seek ADDRESS
    sig = img.read(FS_SIGNATURE.size).unpack('C*')
    puts "Disk #{sig == FS_SIGNATURE ? "contains" : "does not contain"} ReFS filesystem"

- ReFS pages are 0x4000 bytes in length

- On all inspected systems, the first page number is 0x1e (0x78000 bytes after the start of the partition containing the filesystem). This is in line w/ Microsoft documentation which states that the first metadata dir is at a fixed offset on the disk.

- Other pages contain various system, directory, and volume structures and tables as well as journaled versions of each page (shadow-written upon regular disk writes)


- The first byte of each page is its Page Number

- The first 0x30 bytes of every metadata page (dubbed the Page Header) seem to follow a certain pattern:

    byte  0: XX XX 00 00   00 00 00 00   YY 00 00 00   00 00 00 00
    byte 16: 00 00 00 00   00 00 00 00   ZZ ZZ 00 00   00 00 00 00
    byte 32: 01 00 00 00   00 00 00 00   00 00 00 00   00 00 00 00
  • dword 0 (XX XX) is the page number which is sequential and corresponds to the 0x4000 offset of the page
  • dword 2 (YY) is the journal number or sequence number
  • dword 6 (ZZ ZZ) is the "Virtual Page Number", which is non-sequential (eg values are in no apparent order) and seem to tie related pages together.
  • dword 8 is always 01, perhaps an "allocated" flag or other

- Multiple pages may share a virtual page number (byte 24/dword 6) but usually don't appear in sequence.

- The following Ruby code will print out the pages in a ReFS partition along w/ their shadow copies:

    # Point this to the file containing the disk image
    DISK="~/ReFS-disk.img"
    
    # Point this at the start of the partition containing the ReFS filesystem
    ADDRESS=0x500000
    
    PAGE_SIZE=0x4000
    PAGE_SEQ=0x08
    PAGE_VIRTUAL_PAGE_NUM=0x18
    
    FIRST_PAGE = 0x1e
    
    img = File.open(File.expand_path(DISK), 'rb')
    
    page_id = FIRST_PAGE
    img.seek(ADDRESS + page_id*PAGE_SIZE)
    while contents = img.read(PAGE_SIZE)
      id = contents.unpack('S').first
      if id == page_id
        pos = img.pos
    
        start = ADDRESS + page_id * PAGE_SIZE
    
        img.seek(start + PAGE_SEQ)
        seq = img.read(4).unpack("L").first
    
        img.seek(start + PAGE_VIRTUAL_PAGE_NUM)
        vpn = img.read(4).unpack("L").first
    
        print "page: "
        print "0x#{id.to_s(16).upcase}".ljust(7)
        print " @ "
        print "0x#{start.to_s(16).upcase}".ljust(10)
        print ": Seq - "
        print "0x#{seq.to_s(16).upcase}".ljust(7)
        print "/ VPN - "
        print "0x#{vpn.to_s(16).upcase}".ljust(9)
        puts
    
        img.seek pos
      end
      page_id += 1
    end

- The object table (virtual page number 0x02) associates object ids with the pages on which they reside. Here we see an AttributeList consisting of Records of key/value pairs (see below for the specifics on these data structures). We can look up the object id of the root directory (0x600000000) to retrieve the page on which it resides:

   50 00 00 00 10 00 10 00 00 00 20 00 30 00 00 00 - total length / key & value boundaries
   00 00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 - object id
   F4 0A 00 00 00 00 00 00 00 00 02 08 08 00 00 00 - page id / flags
   CE 0F 85 14 83 01 DC 39 00 00 00 00 00 00 00 00 - checksum
   08 00 00 00 08 00 00 00 04 00 00 00 00 00 00 00

^ The object table entry for the root dir, containing its page (0xAF4)

- When retrieving pages by id or virtual page number, look for the ones with the highest sequence number, as those are the latest copies of the shadow-write mechanism.

- Expanding upon the previous example we can implement some logic to read and dump the object table:

    ATTR_START = 0x30
    
    def img
      @img ||= File.open(File.expand_path(DISK), 'rb')
    end
    
    def pages
      @pages ||= begin
        _pages = {}
        page_id = FIRST_PAGE
        img.seek(ADDRESS + page_id*PAGE_SIZE)
    
        while contents = img.read(PAGE_SIZE)
          id = contents.unpack('S').first
          if id == page_id
            pos = img.pos
            start = ADDRESS + page_id * PAGE_SIZE
            img.seek(start + PAGE_SEQ)
            seq = img.read(4).unpack("L").first
    
            img.seek(start + PAGE_VIRTUAL_PAGE_NUM)
            vpn = img.read(4).unpack("L").first
            _pages[id] = {:id => id, :seq => seq, :vpn => vpn}
            img.seek pos
          end
    
          page_id += 1
        end
    
        _pages
      end
    end
    
    def page(opts)
      if opts.key?(:id)
        return pages[opts[:id]]
      elsif opts[:vpn]
        return pages.values.select { |v|
          v[:vpn] == opts[:vpn]
        }.sort { |v1, v2| v1[:seq] <=> v2[:seq] }.last
      end
    
      nil
    end
    
  
    def obj_pages
      @obj_pages ||= begin
        obj_table = page(:vpn => 2)
  
        img.seek(ADDRESS + obj_table[:id] * PAGE_SIZE)
        bytes = img.read(PAGE_SIZE).unpack("C*")
        len1 = bytes[ATTR_START]
        len2 = bytes[ATTR_START+len1]
        start = ATTR_START + len1 + len2
  
        objs = {}
  
        while bytes.size > start && bytes[start] != 0
          len = bytes[start]
          id  = bytes[start+0x10..start+0x20-1].collect { |i| i.to_s(16).upcase }.reverse.join()
          tgt = bytes[start+0x20..start+0x21].collect   { |i| i.to_s(16).upcase }.reverse.join()
          objs[id] = tgt
          start += len
        end
  
        objs
      end
    end
  
    obj_pages.each { |id, tgt|
      puts "Object #{id} is on page #{tgt}"
    }

We could also implement a method to lookup a specific object's page:

    def obj_page(obj_id)
      obj_pages[obj_id]
    end

    puts page(:id => obj_page("0000006000000000").to_i(16))

This will retrieve the page containing the root directory


- Directories, from the root dir down, follow a consistent pattern. They are comprised of sequential lists of data structures whose length is given by the first word value (Attributes and Attribute Lists).

Lists are often prefixed with a Header Attribute defining the total length of the Attributes that follow and constitute the list. Though this is not a hard-set rule, as in the case where the list resides in the body of another Attribute (more on that below).

In either case, Attributes may be parsed by iterating over the bytes after the directory page header, reading and processing the first word to determine the next number of bytes to read (minus the length of the first word), and then repeating until null (0000) is encountered (being sure to process specified padding in the process)

- Various Attributes take on different semantics including references to subdirs and files as well as branches to additional pages containing more directory contents (for large directories); though not all Attributes have been identified.

The structures in a directory listing always seem to be of one of the following formats:

- Base Attribute - The simplest / base attribute consisting of a block whose length is given at the very start.

An example of a typical Attribute follows:

      a8 00 00 00  28 00 01 00  00 00 00 00  10 01 00 00  
      10 01 00 00  02 00 00 00  00 00 00 00  00 00 00 00  
      00 00 00 00  00 00 00 00  a9 d3 a4 c3  27 dd d2 01  
      5f a0 58 f3  27 dd d2 01  5f a0 58 f3  27 dd d2 01  
      a9 d3 a4 c3  27 dd d2 01  20 00 00 00  00 00 00 00  
      00 06 00 00  00 00 00 00  03 00 00 00  00 00 00 00  
      5c 9a 07 ac  01 00 00 00  19 00 00 00  00 00 00 00  
      00 00 01 00  00 00 00 00  00 00 00 00  00 00 00 00  
      00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  
      00 00 00 00  00 00 00 00  01 00 00 00  00 00 00 00  
      00 00 00 00  00 00 00 00

Here we see a section of 0xA8 length containing the following four file timestamps (more on this conversion below):

       a9 d3 a4 c3  27 dd d2 01 - 2017-06-04 07:43:20
       5f a0 58 f3  27 dd d2 01 - 2017-06-04 07:44:40
       5f a0 58 f3  27 dd d2 01 - 2017-06-04 07:44:40
       a9 d3 a4 c3  27 dd d2 01 - 2017-06-04 07:43:20

It is safe to assume that either

  • one of the first fields in any given Attribute contains an identifier detailing how the attribute should be parsed _or_
  • the context is given by the Attribute's position in the list.

The following is a method which can be used to parse a given Attribute off disk, provided the img read position is set to its start:

    def read_attr
      pos = img.pos
      packed = img.read(4)
      return new if packed.nil?
      attr_len = packed.unpack('L').first
      return new if attr_len == 0

      img.seek pos
      value = img.read(attr_len)
      Attribute.new(:pos   => pos,
                    :bytes => value.unpack("C*"),
                    :len   => attr_len)
    end

- Records - Key / Value pairs whose total length and key / value lengths are given in the first 0x20 bytes of the attribute. These are used to associate metadata sections with files, whose names are recorded in the keys and whose contents are recorded in the values.

An example of a typical Record follows:

    40 04 00 00   10 00 1A 00   08 00 30 00   10 04 00 00   @.........0.....
    30 00 01 00   6D 00 6F 00   66 00 69 00   6C 00 65 00   0...m.o.f.i.l.e.
    31 00 2E 00   74 00 78 00   74 00 00 00   00 00 00 00   1...t.x.t.......
    A8 00 00 00   28 00 01 00   00 00 00 00   10 01 00 00   ¨...(...........
    10 01 00 00   02 00 00 00   00 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   A9 D3 A4 C3   27 DD D2 01   ........©Ó¤Ã'ÝÒ.
    5F A0 58 F3   27 DD D2 01   5F A0 58 F3   27 DD D2 01   _ Xó'ÝÒ._ Xó'ÝÒ.
    A9 D3 A4 C3   27 DD D2 01   20 00 00 00   00 00 00 00   ©Ó¤Ã'ÝÒ. .......
    00 06 00 00   00 00 00 00   03 00 00 00   00 00 00 00   ................
    5C 9A 07 AC   01 00 00 00   19 00 00 00   00 00 00 00   \..¬............
    00 00 01 00   00 00 00 00   00 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   00 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   01 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   20 00 00 00   A0 01 00 00   ........ ... ...
    D4 00 00 00   00 02 00 00   74 02 00 00   01 00 00 00   Ô.......t.......
    78 02 00 00   00 00 00 00 ...(cutoff)                   x.......

Here we see the Record parameters given by the first row:

  • total length - 4 bytes = 0x440
  • key offset - 2 bytes = 0x10
  • key length - 2 bytes = 0x1A
  • flags / identifer - 2 bytes = 0x08
  • value offset - 2 bytes = 0x30
  • value length - 2 bytes = 0x410

Naturally, the Record finishes after the value, 0x410 bytes after the value start at 0x30, or 0x440 bytes after the start of the Record (which lines up with the total length).

We also see that this Record corresponds to a file I created on disk as the key is the Record flag (0x10030) followed by the filename (mofile1.txt).

Here the first attribute in the Record value is the simple attribute we discussed above, containing the file timestamps. The File Reference Attribute List Header follows (more on that below).

From observation, Records w/ flag values of '0' or '8' are what we are looking for. While '4' occurs often, it almost always seems to indicate a Historical Record, that is, a Record that has since been replaced with another.

Since Records are prefixed with their total length, they can be thought of as a subclass of Attribute. The following is a Ruby class that uses composition to dispatch record field lookup calls to values in the underlying Attribute:

    class Record
      attr_accessor :attribute

      def initialize(attribute)
        @attribute = attribute
      end

      def key_offset
        @key_offset ||= attribute.words[2]
      end

      def key_length
        @key_length ||= attribute.words[3]
      end

      def flags
        @flags ||= attribute.words[4]
      end

      def value_offset
        @value_offset ||= attribute.words[5]
      end

      def value_length
        @value_length ||= attribute.words[6]
      end

      def key
        @key ||= begin
          ko, kl, vo, vl = boundries
          attribute.bytes[ko...ko+kl].pack('C*')
        end
      end

      def value
        @value ||= begin
          ko, kl, vo, vl = boundries
          attribute.bytes[vo..-1].pack('C*')
        end
      end

      def value_pos
        attribute.pos + value_offset
      end

      def key_pos
        attribute.pos + key_offset
      end
    end # class Record

- AttributeList - These are more complicated but interesting. At first glance they are simple Attributes of length 0x20, but upon further inspection we consistently see that they contain the length of a larger block of Attributes (this length is inclusive, as it contains this first one). After parsing this Attribute, dubbed the 'List Header', we should read the remaining bytes in the List as well as the padding, before arriving at the next Attribute.

   20 00 00 00   A0 01 00 00   D4 00 00 00   00 02 00 00 <- list header specifying total length (0x1A0) and padding (0xD4)
   74 02 00 00   01 00 00 00   78 02 00 00   00 00 00 00
   80 01 00 00   10 00 0E 00   08 00 20 00   60 01 00 00
   60 01 00 00   00 00 00 00   80 00 00 00   00 00 00 00
   88 00 00 00  ... (cutoff)

Here we see an Attribute of 0x20 length, that contains a reference to a larger block size (0x1A0) in its third word.

This can be confirmed by the next Attribute, whose size (0x180) is the larger block size minus the length of the header (0x1A0 - 0x20). In this case the list only contains one item/child attribute.

In general a simple strategy to parse the entire list would be to:

  • Parse Attributes individually as normal
  • If we encounter a List Header Attribute, we calculate the size of the list (total length minus header length)
  • Then continue parsing Attributes, adding them to the list until the total length is completed.

It also seems that:

  • the padding that occurs after the list is given by header word number 5 (in this case 0xD4). After the list is parsed, we consistently see this many null bytes before the next Attribute begins (which is not part of & unrelated to the list).
  • the type of list is given by its 7th word; directory contents correspond to 0x200 while directory branches are indicated with 0x301

Here is a class that represents an AttributeList header attribute by encapsulating it in a similar manner to Record above:

    class AttributeListHeader
      attr_accessor :attribute

      def initialize(attr)
        @attribute = attr
      end

      # From my observations this is always 0x20
      def len
        @len ||= attribute.dwords[0]
      end

      def total_len
        @total_len ||= attribute.dwords[1]
      end

      def body_len
        @body_len ||= total_len - len
      end

      def padding
        @padding ||= attribute.dwords[2]
      end

      def type
        @type ||= attribute.dwords[3]
      end

      def end_pos
        @end_pos ||= attribute.dwords[4]
      end

      def flags
        @flags ||= attribute.dwords[5]
      end

      def next_pos
        @next_pos ||= attribute.dwords[6]
      end
    end

Here is a method to parse the actual Attribute List assuming the image read position is set to the beginning of the List Header

    def read_attribute_list
      header        = Header.new(read_attr)
      remaining_len = header.body_len
      orig_pos      = img.pos
      bytes         = img.read remaining_len
      img.seek orig_pos

      attributes = []

      until remaining_len == 0
        attributes    << read_attr
        remaining_len -= attributes.last.len
      end

      img.seek orig_pos - header.len + header.end_pos

      AttributeList.new :header     => header,
                        :pos        => orig_pos,
                        :bytes      => bytes,
                        :attributes => attributes
    end

Now we have most of what is needed to locate and parse individual files, but there are a few missing components including:

- Directory Tree Branches: These are Attribute Lists where each Attribute corresponds to a record whose value references a page which contains more directory contents.

Upon encountering an AttributeList header with flag value 0x301, we should

  • iterate over the Attributes in the list,
  • parse them as Records,
  • use the first dword in each value as the page to repeat the directory traversal process (recursively).

Additional files and subdirs found on the referenced pages should be appended to the list of current directory contents.

Note this is the (an?) implementation of the BTree structure in the ReFS filesystem described by Microsoft, as the record keys contain the tree leaf identifiers (based on file and subdirectory names).

This can be used for quick / efficient file and subdir lookup by name (see 'optimization' in 'next steps' below)

- SubDirectories: these are simply Records in the directory's Attribute List whose key contains the 0x20030 flag as well as the subdir name.

The value of this Record is the corresponding object id which can be used to lookup the page containing the subdir in the object table.

A typical subdirectory Record

    70 00 00 00  10 00 12 00  00 00 28 00  48 00 00 00  
    30 00 02 00  73 00 75 00  62 00 64 00  69 00 72 00  <- here we see the key containing the flag (30 00 02 00) followed by the dir name ("subdir2")
    32 00 00 00  00 00 00 00  03 07 00 00  00 00 00 00  <- here we see the object id as the first qword in the value (0x730)
    00 00 00 00  00 00 00 00  14 69 60 05  28 dd d2 01  <- here we see the directory timestamps (more on those below)
    cc 87 ce 52  28 dd d2 01  cc 87 ce 52  28 dd d2 01  
    cc 87 ce 52  28 dd d2 01  00 00 00 00  00 00 00 00  
    00 00 00 00  00 00 00 00  00 00 00 10  00 00 00 00

- Files: like directories are Records whose key contains a flag (0x10030) followed by the filename.

The value is far more complicated though and while we've discovered some basic Attributes allowing us to pull timestamps and content from the fs, there is still more to be deduced as far as the semantics of this Record's value.

- The File Record value consists of multiple attributes, though they just appear one after each other, without a List Header. We can still parse them sequentially given that all Attributes are individually prefixed with their lengths and the File Record value length gives us the total size of the block.

- The first attribute contains 4 file timestamps at an offset given by the fifth byte of the attribute (though this position may be coincidental and the timestamps could just reside at a fixed location in this attribute).

In the first attribute example above we see the first timestamp is

       a9 d3 a4 c3  27 dd d2 01

This corresponds to the following date

        2017-06-04 07:43:20

And may be converted with the following algorithm:

          tsi = TIMESTAMP_BYTES.pack("C*").unpack("Q*").first
          Time.at(tsi / 10000000 - 11644473600)

Timestamps are given in 100-nanosecond intervals since the Windows epoch (Jan 1, 1601 UTC); 11644473600 is the offset in seconds between that epoch and the Unix epoch.

- The second Attribute seems to be the Header of an Attribute List containing the 'File Reference' semantics. These are the Attributes that encapsulate the file length and content pointers.

I'm assuming this is an Attribute List so as to contain many of these types of Attributes for large files. What is not apparent are the full semantics of all of these fields.

But here is where it gets complicated: this List only contains a single attribute with a few child Attributes. This encapsulation seems to be in the same manner as the Attributes stored in the File Record value above, just a simple sequential collection without a Header.

In this single attribute (dubbed the 'File Reference Body') the first Attribute contains the length of the file while the second is the Header for yet another List, this one containing a Record whose value contains a reference to the page on which the file contents actually reside.

      ----------------------------------------
      | ...                                  |
      ----------------------------------------
      | File Entry Record                    |
      | Key: 0x10030 [FileName]              |
      | Value:                               |
      | Attribute1: Timestamps               |
      | Attribute2:                          |
      |   File Reference List Header         |
      |   File Reference List Body(Record)   |
      |     Record Key: ?                    |
      |     Record Value:                    |
      |       File Length Attribute          |
      |       File Content List Header       |
      |       File Content Record(s)         |
      | Padding                              |
      ----------------------------------------
      | ...                                  |
      ----------------------------------------

While complicated, each level can be parsed in a similar manner to all other Attributes & Records, just taking care to parse Attributes into their correct levels & structures.

As far as actual values,

  • the file length is always seen at a fixed offset within its attribute (0x3c) and
  • the content pointer seems to always reside in the second qword of the Record value. This pointer is simply a reference to the page from which the file contents can be read verbatim.
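
A minimal sketch of pulling those two values out, assuming the surrounding attribute / record parsing already hands us the raw bytes (the qword width of the length field is an assumption on my part):

    import struct

    def parse_file_reference(length_attr, content_record_value):
        """length_attr: first attribute of the File Reference Body (length at 0x3c);
        content_record_value: raw value of the File Content Record (page in 2nd qword)."""
        file_length = struct.unpack_from("<Q", length_attr, 0x3c)[0]
        content_page = struct.unpack_from("<Q", content_record_value, 8)[0]
        return file_length, content_page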

---

And that's it! An example implementation of all this logic can be seen in our experimental 'resilience' library found here:

https://github.com/movitto/resilience

The next steps would be to

  • expand upon the data structures above (verify that we have interpreted the existing structures correctly)
  • deduce full Attribute and Record semantics so as to be able to consistently parse files of any given length, with any given number of modifications out of the file system

And once we have done so robustly, we can start looking at optimization, possibly rolling out some experimental production logic for ReFS filesystem support!

... Cha-ching $ £ ¥ ¢ ₣ ₩ !!!!

RetroFlix / PI Switch Followup

Posted by Mo Morsi on July 05, 2017 07:22 PM

I've been trying to dedicate some cycles to wrapping up the Raspberry PI entertainment center project mentioned a while back. I decided to abandon the PI Switch idea as the original controller which was purchased for it just did not work properly (or should I say only worked sporadically/intermittently). It being a cheap device bought online, it wasn't worth the effort to debug (funnily enough I can't find the device on Amazon anymore, perhaps other people were having issues...).

Not being able to find another suitable gamepad to use as the basis for a snap together portable device, I bought a Rii wireless controller (which works great out of the box!) and dropped the project (also partly due to lack of personal interest). But the previously designed wall mount works great, and after a bit of work the PI now functions as a seamless media center.

Unfortunately to get it there, a few workarounds were needed. These are listed below (in no particular order).

  • To start off, increase your GPU memory. This will be needed to run games with any reasonable performance. This can be accomplished through the Raspberry PI configuration interface.

    [Screenshots: Rpi setup 1 and 2]

    Here you can also overclock your PI if your model supports it (v3.0 does not, as evident from the screenshot, though there are workarounds)

  • If you are having trouble w/ the PI output resolution being too large / small for your tv, try adjusting the aspect ratio on your set. Previously mine was set to "theater mode", cutting off the edges of the device output. Resetting it to normal resolved the issue.

    [Screenshots: Rpi setup 3, 4 and 5]
  • To get the Playstation SixAxis controller working via bluetooth required a few steps.
    • Unplug your playstation (since it will boot by default when the controller is activated)
    • On the PI, run
              sudo bluetoothctl
      
    • Start the controller and watch for a new device in the bluetoothctl output. Make note of the device id
    • Still in the bluetoothctl command prompt, run
              trust [deviceid]
      
    • In the Raspberry PI bluetooth menu, click 'make discoverable' (this can also be accomplished via the bluetoothctl command prompt with the 'discoverable on' command) [Screenshot: Rpi setup 6]
    • Finally restart the controller and it should autoconnect!
  • To install recent versions of Ruby you will need to install and set up rbenv. The current version in the RPI repos is too old to be of use (of interest for RetroFlix, see below)
  • Using mednafen requires some config changes, notably to disable opengl output and enable SDL. Specifically change the following line from
          video.driver opengl
    
    To
          video.driver sdl
    
    Unfortunately after a lot of effort, I was not able to get mupen64 working (while most games start, as confirmed by audio cues, all have black / blank screens)... so no N64 games on the PI for now ☹
  • But who needs N64 when you have Nethack! ♥‿♥ (the most recent version of which works flawlessly). In addition to the small tweaks needed to compile the latest version on Linux, in order to get the awesome Nevanda tileset working, update include/config.h to enable XPM graphics:
        -/* # define USE_XPM */ /* Disable if you do not have the XPM library */
        +#define USE_XPM  /* Disable if you do not have the XPM library */
    
    Once installed, edit your nh/install/games/lib/nethackdir/NetHack.ad config file (in ~ if you installed nethack there) to reference the new tileset:
        -NetHack.tile_file: x11tiles
        +NetHack.tile_file: /home/pi/Downloads/Nevanda.xpm
    

Finally RetroFlix received some tweaking & love. Most changes were visual optimizations and eye candy (including some nice retro fonts and colors), though workers were also added so the actual downloads could be performed without blocking the UI. Overall it's simple and works great, a perfect portal to work on those high scores!

That's all for now, look for some more updates on the ReFS front in the near future!

Week five: Summer coding report

Posted by squimrel on July 04, 2017 01:35 AM

About the issue from last week: we decided to tell libblockdev to use libfdisk instead of libparted, since there are rumors that they would like to do that anyway. I'm not working on that for now though, since we have to make progress with the actual project at stake.

Windows

This week I worked on porting the FMW to Windows. To do that I had to build and package iso9660io for MinGW which really was not a nice thing to do.

Since I moved isomd5sum out of the FMW projects source code I had to build and package isomd5sum for MinGW too.

I've got source code on my end that should be doing most of what's needed to get persistent storage working on Windows, but it's using diskpart, and I would like to move away from that tool since it messes up a lot of things.

My mentor warned me that the Windows C APIs for these things are terrible and he was totally right.

GSoC: How fedmsg integrates with Kiskadee

Posted by David Carlos on June 29, 2017 06:36 PM

If you want to know why we are using fedmsg [1] with Kiskadee [2] you can check this post, where I explain the reasons for such integration. Now we will cover the implementation of this integration and the current status of Kiskadee's architecture.

fedmsg-hub is a daemon used to interact with the fedmsg bus; with it we can receive and send messages from and to applications. If you have cloned the Kiskadee repository, create the fedmsg-hub configuration files and run the daemon:

sudo mkdir -p /etc/fedmsg.d/
sudo cp util/base.py util/endpoints.py  /etc/fedmsg.d/
sudo cp util/anityaconsumer.py /etc/fedmsg.d/
pip install -e .
PYTHONPATH=`pwd` fedmsg-hub

If everything goes OK, fedmsg-hub will start and will use our AnityaConsumer class to consume the fedmsg bus. The endpoints.py file tells fedmsg-hub the list of addresses from which fedmsg messages can be received; in our case this endpoint points to an Anitya server, where new project releases are published. Basically, a fedmsg-hub consumer is a class that inherits from fedmsg.consumers.FedmsgConsumer and implements a consume method.

The fedmsg-hub daemon runs in a separate process from Kiskadee, so we need some mechanism to make the consumer send the bus messages to Kiskadee. To do this we are using ZeroMQ [3] as a pub/sub library, in such a way that the consumer publishes the incoming messages and the anitya plugin in Kiskadee consumes these messages.

Once the message arrives in Kiskadee, the default life cycle of a source will occur.

  • The Anitya plugin will queue the source.
  • The monitor component will dequeue the source.
    • If the source version does not exist in the database, it is saved and the source is queued for the runner component.
    • If the source version already exists in the database, nothing happens.
  • The runner component will dequeue the source.
  • The runner component will run an analysis and save the result in the database.

This post has a better description of the Kiskadee architecture. Each service that publishes on fedmsg has several topics, each one related to a specific event. Here you can check the list of topics where Anitya publishes messages. Our consumer has only one responsibility: when a message arrives, publish it on the ZeroMQ server, on the anitya topic. Kiskadee will be listening to this topic and will receive the message. Let's take a look at the consumer code:

import time

import fedmsg.consumers
import zmq


class AnityaConsumer(fedmsg.consumers.FedmsgConsumer):
    """Consumer used by fedmsg-hub to subscribe to the fedmsg bus."""

    topic = 'org.release-monitoring.prod.anitya.project.version.update'
    config_key = 'anityaconsumer'
    validate_signatures = False

    def __init__(self, *args, **kw):
        """AnityaConsumer constructor."""
        super().__init__(*args, **kw)
        context = zmq.Context()
        self.socket = context.socket(zmq.PUB)
        self.socket.bind("tcp://*:5556")

    def consume(self, msg):
        """Consume events from fedmsg-hub and republish them over ZeroMQ."""
        self.socket.send_string("%s %s" % ("anitya", str(msg)))
        time.sleep(1)

ZeroMQ is also used by fedmsg, so the topic variable can cover more than one Anitya topic. If we want to receive all messages published by Anitya, we just need to use a more generic topic value:

topic = 'org.release-monitoring.prod.anitya.project.*'
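
On the Kiskadee side, the anitya plugin just has to subscribe to the socket that AnityaConsumer binds. Purely as an illustration (this is not Kiskadee's actual plugin code, and the localhost endpoint is an assumption), the receiving end could look roughly like this:

import zmq

def anitya_events(endpoint="tcp://localhost:5556"):
    """Yield raw Anitya messages republished by AnityaConsumer over ZeroMQ."""
    context = zmq.Context()
    socket = context.socket(zmq.SUB)
    socket.connect(endpoint)
    socket.setsockopt_string(zmq.SUBSCRIBE, "anitya")
    while True:
        topic, _, payload = socket.recv_string().partition(" ")
        yield payload  # the fedmsg message published by Anitya, as a string
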
[1]http://www.fedmsg.com/en/latest/
[2]https://pagure.io/kiskadee
[3]http://zeromq.org/

MemberOf Support Is Complete

Posted by Ilias Stamatis on June 28, 2017 10:09 PM

lib389 support for MemberOf plug-in is finally complete!

Here’s what I have implemented so far regarding this issue:

  • Code for configuring the plug-in using our LDAP ORM system.
  • The wiring in the dsconf tool so we can manage the plug-in from the command line.
  • Some generic utility functions to use for testing this and all future plug-ins.
  • Functional tests.
  • Command-line tests.
  • A new Task class for managing task entries based on the new lib389 code.
  • The fix-up task for MemberOf.

I have proudly written a total of 40 test cases; 8 functional and 32 cli tests.

All of my commits that have been merged into the project up to this point – not only related to MemberOf, but in general – can be found here: https://pagure.io/lib389/commits/master?author=stamatis.iliass%40gmail.com

As I’ve said again in a previous post, I have additionally discovered and reported a few bugs related to the C code of MemberOf. I have written reproducers for some of them too (test cases that prove the erroneous behavior).

At the same time, I was working on USN plug-in support as well. This is about tracking modifications to the database by using Update Sequence Numbers. When enabled, sequence numbers are assigned to an entry whenever a write operation is performed against the entry. This value is stored in a special operational attribute for each entry, called “entryusn”. The process for me was pretty much the same; config code, dsconf wiring, tests, etc. This work is also almost complete and hopefully it will be merged soon as well.

To conclude, during this first month of Google Summer of Code, I have worked on MemberOf and USN plug-ins integration, did code reviews on other team members’ patches, and worked on other small issues.


GSoC: Why integrate fedmsg with Kiskadee

Posted by David Carlos on June 28, 2017 06:20 PM

In this post we will talk about why we decided to integrate Kiskadee [1] with fedmsg [2], and how this integration will enable us to easily monitor the source code of several different projects. The next post will explain how this integration was done.

There is an initiative in Fedora called Anitya [3]. Anitya is a project version monitoring system that monitors upstream releases and broadcasts them on fedmsg. Registering a project on Anitya is quite simple: you need to inform the homepage, the system used to host the project, and some other information required by the system. After the registration process, Anitya will check every day whether a new release of the project has been published. If so, it will publish the new release, in JSON format, on fedmsg. In the context of Anitya, the systems used to host projects are called backends, and you can check all the supported backends at this link: https://release-monitoring.org/about.

The Fedora infrastructure has several different services that need to talk to each other. One simple example is the AutoQA service, which listens to some events triggered by the fedpkg library. If only two services interact the problem is minimal, but when several applications send requests and responses to several other applications, the problem becomes huge. fedmsg (FEDerated MeSsaGe bus) is a Python package and API defining a brokerless messaging architecture to send and receive messages to and from applications. Anitya uses this messaging architecture to publish the new releases of registered projects on the bus. Any application that is subscribed to the bus can retrieve these events. Note that fedmsg is a whole architecture, so we need some mechanism to subscribe to the bus, and some mechanism to publish on the bus. fedmsg-hub is a daemon used to interact with the fedmsg bus, and it is being used by Kiskadee to consume the new releases published by Anitya.

Once Kiskadee can receive notifications that a new release of some project was made, and that this project will be packaged for Fedora, we can trigger an analysis without having to watch the Fedora repositories directly. Obviously this is a generic solution that will analyze several upstreams, including the upstreams that will be packaged, but it is a first step towards our goal, which is to help the QA team and the distribution monitor the quality of the upstreams that will become Fedora packages.

[1]https://pagure.io/kiskadee
[2]http://www.fedmsg.com/en/latest/
[3]https://release-monitoring.org

Week four: Summer coding report

Posted by squimrel on June 26, 2017 11:04 PM

This was a sad week since I was too ill to work until Thursday evening and was gone for the weekend starting Friday. That being said, I did not work on anything apart from a PR to allow the user to specify the partition type when talking to UDisks. (To be honest I'm still ill but I can do work :-).)

Anyways, let me explain to you (again) how we're trying to partition the disk on Linux and why we run into so much trouble doing so.

<figure><figcaption>How we’re trying to partition</figcaption></figure>

The good thing about using UDisks is that it's a centralized daemon everyone can use over the bus, so it can act as an event emitter and it can also manage all the devices. Since the bottleneck when working with devices is disk I/O anyway, a centralized daemon is not a bad idea.

Let’s focus on using UDisks to partition a disk and the current problem (not the problems discussed in previous reports).

The issue is that libparted thinks the disk is a mac label because of the isohybrid layout bootable ISO images use so that every system can boot them. Instead it should treat the disk as a dos label. This is important because the maximum number of partitions on a mac label is only 3, while on a dos label it's 4.

The issue could be fixed by:

  • Fixing libparted.
  • Telling libblockdev to use libfdisk instead of libparted.
  • Not using UDisks at all and instead using libfdisk directly.

But can we actually use the fourth partition on a system that runs MacOS or Windows natively?
This is a legit question because we don’t actually know if adding a partition breaks the isohybrid layout.
I’d guess that it doesn’t and I’d also guess that once the kernel took control the partition table is read “correctly” by Linux and therefore it should detect the fourth partition and work. But I don’t know yet.

Using the proof of concept, I did test this on a VM and on a laptop that usually runs Linux, and in both cases persistent storage worked fine.
At the moment my mentor is testing this on a device that usually runs Windows and on one that usually runs macOS to see if it works there, and even though the results are not in yet it doesn't look that good.

If this doesn’t work we’re in big trouble since we’ll have to take a totally different approach to creating a bootable device that has persistent storage enabled.

GSoC2017 (Fedora) — Week 3&4

Posted by Mandy Wang on June 26, 2017 05:28 PM

I went to Guizhou and Hunan in China for my after-graduation trip last week. I walked on the glass skywalk in Zhangjiajie, visited the Huangguoshu waterfalls and Fenghuang Ancient City, ate a lot of delicious food at the night market in Guiyang and so on. I had a wonderful time there, welcome to China to experience these! (GNOME Asia, held in Chongqing in October, is a good choice; Chongqing is a big city which has a lot of hot food and hot girls.)

The main work I did these days for GSoC was sorting out and detailing the work of establishing the environment for Plinth in Fedora. I had done it in a somewhat crude way before, such as using some Debian packages directly, but now I will make these steps clearer, organize the useful information and write it into the INSTALL file.

But my mentor and I had a problem when I tried to run firstboot: I don't know which packages are needed when I want to debug JS in Fedora. In other words, I want to find which packages in Fedora have the same function as libjs-bootstrap, libjs-jquery and libjs-modernizr in Debian. If you know how to deal with it, please tell me, I'd be grateful.


Week three: Summer coding report

Posted by squimrel on June 19, 2017 08:22 PM

A tiny PR to libblockdev got merged! It added the feature to ignore libparted warnings. This is important for the project I’m working on since it’s using udisks which uses libblockdev to partition disks and that wasn’t working because of a parted warning as mentioned in my previous report.
Poorly it’s still does not work since udisks tells libblockdev to be smart about picking a partition type and since there’re already three partitions on the disks libblockdev tries to create an extended partition which fails due to the following error that is thrown by parted:

mac disk labels do not support extended partitions.

Let’s see what we’ll do about that. By the way all this has the upside that I got to know the udisks, libblockdev and parted source code.

Releasing and packaging squimrel/iso9660io is easy now since I automated it in a script.

Luckily I worked on isomd5sum before because I need to get a quite ugly patch through so that it can be used together with a file descriptor that uses the O_DIRECT flag.

A checkbox to enable persistent storage has been added to the UI.

So far there has been time to handle unexpected issues, but since next week is the last week before the first evaluations, things should definitely work by then.

389 DS development and Git Scenarios

Posted by Ilias Stamatis on June 14, 2017 02:24 PM

DS Development

Let’s see how development takes place in the 389 Directory Server project. The process is very simple, yet sufficient. There is a brief how-to-contribute guide, which also contains first steps to start experimenting with the server: http://www.port389.org/docs/389ds/contributing.html

The project uses git as its VCS and it is officially hosted on pagure; Fedora’s platform for hosting git repositories. In contrast to other big projects, no complex git branching workflow is followed. There’s no develop nor next branch. Just master and a few version branches. New work is integrated into master directly. In case of lib389, only master exists at the moment, but it’s still a new project.

To work on something new you first open a new issue on pagure and you can additionally have some initial discussion about it. When your work is done you can submit a patch into the pagure issue related to your work. You then have to send an e-mail to the developer mailing list kindly asking for review.

After the review request you’re going to receive comments related to the code, offering suggestions or asking questions. You might have to improve your patch and repeat this process a few times. Once it’s good, somebody will set the review status to “ack” and will merge your code to the master branch. The rule is that for something to be merged, one core developer (other than the patch author of course) has to give his permission; give an ACK. The name of the reviewer(s) is always included in commit messages as well.

Working with git

Until now, I’m working only on the lib389 code base. I’m maintaining a fork on github. My local repository has 2 remotes. origin is the official lib389 repository in order for me to pull changes, and the other one called github is the one hosted on github. I’m actually using the github one only to push code that is not yet ready for submission / review.

So, every time I want to work on something, I checkout master, create a new topic branch and start working. E.g.

git checkout master
git checkout -b issue179 # create a new branch and switch to it

If you already have experience with git (branching, rebasing, resetting, etc.) you will probably not find anything interesting below this point. However, I would like to hear opinions/suggestions about the different approaches that can be used in some cases that I describe below. They might well help me improve the way I work.

Squashing commits

So you’re submitting your patch, but then you need to do some changes and re-submit. Or while working on your fix, you prefer to commit many times with small changes each time. Or you have to make additions for things that you hadn’t thought of before. But the final patch needs to be generated from a single commit. Hence, you have to squash multiple commits into a single one. What do you do?

Personally, since I know that I’ll have to submit a single patch, I don’t bother creating new commits at all. Instead, I prefer to “update” my last commit each time:

git add file.py # stage new changes
git commit --amend --date="`date`"

Notice that I like to update the commit date with the current date as well.

Then I can push my topic branch to my github remote:

git push github issue179 --force

I have to force this action since what I actually did previously was to rewrite the repository’s history. In this case it’s safe to do it because I can assume that nobody had based their work on this personal topic branch of mine.

But actually what I described above wasn’t really about squashing commits, since I never create more than one. Instead, there are 2 ways that I’m aware of, that can be used when you have committed multiple times already. One is by using interactive rebasing. The other is by using git reset. Both approaches and some more are summed up here: https://stackoverflow.com/questions/5189560/squash-my-last-x-commits-together-using-git

Rebasing work

You started working on an issue but in the meanwhile other changes have been merged into master. These changes probably affect you, so you have to rebase your work against those changes. This actually happened to me when we decided to base new work on python3 on lib389. But this was after I had already started working on an issue, so I had to rebase my work and make sure that all new code is now python3 compatible.

The process is normally easy to achieve. Instead of merging I usually prefer to use git rebase. So, if we suppose that I’m working on my issue179 branch and I want to rebase it against master all I really have to do is:

git checkout issue179
git rebase master

Rebasing makes for a cleaner history. If you examine the log of a rebased branch, it looks like a linear history and that’s why I normally prefer it in general. It can be dangerous sometimes though. Again, it is safe here, assuming that nobody else is basing work on my personal topic branch.

A more complex scenario

Let’s suppose that I have made some work on a topic branch called issue1 but it’s not merged yet; I’m waiting for somebody to review it. I want to start working on something else based on that first work done in issue1. Yet I don’t want to have it on the same patch, because it’s a different issue. So, I start a topic branch called issue2 based on my issue1 branch and make some commits there.

Then, a developer reviews my first patch and proposes some changes, which I happily implement. After this, I have changed the history of issue1 (because of what I had described above). Now issue2 is based on something that no longer exists and I want to rebase it against my new version of issue1 branch.

Let’s assume that ede3dc03 is the checksum of the last commit of issue2; the one that reflects the diff between issue2 and old issue1. What I do in this case is the following:

git checkout issue1 # new issue1
git checkout -b issue2-v2
git cherry-pick ede3dc03
git branch -D issue2 # delete old issue2 branch
git branch --move issue2-v2 issue2 # rename new branch to issue2

A cherry-pick in git is like a rebase for a single commit. It takes the patch that was introduced in a commit and tries to reapply it on the branch you’re currently on.

I actually don’t like this approach a lot, but it works for now. I’m sure there are more approaches and probably better / easier ones. So, I would be very glad to hear about them from you.

Creating the patch

So after doing all the work, and having it in a single commit pointed by HEAD, the only thing we have to do is create the patch:

git format-patch HEAD~1

Please always use format-patch to create patches instead of git diff, Unix diff command or anything else. Also always make sure to run all tests before submitting a patch!


Week two: Summer coding report

Posted by squimrel on June 12, 2017 05:31 PM

My PRs to rhinstaller/isomd5sum got merged! Which caused the 1.2.0 release to fail to build. Bam! I’m good at breaking things.
There’s a commit that makes this a proper dependency of the MediaWriter.

I had a look at packaging because package bundling is not cool according to the guidelines. Which means that I’ll have to package squimrel/iso9660 if I want to use it.

You can now make install squimrel/iso9660 and it’ll correctly place the shared library and header file.

The helper part of the FMW was reorganized, but unfortunately I'm stuck at adding the partition due to this error:

Failed to read partition table on device ‘/dev/sdx’

reported by libblockdev due to this warning:

Warning: The driver descriptor says the physical block size is 2048 bytes, but Linux says it is 512 bytes.

reported by libparted most likely due to the sector size of 2048 bytes of iso images.

The Windows and Mac builds fail on the Linux-only development branch since I broke them on purpose.

Up next:

  • Somehow add the partition.
  • Merge the dev branch of squimrel/iso9660.
  • Create a .spec file for squimrel/iso9660.
  • Make implantISOFD work using a udisks2 file descriptor.
  • Look at what’s next.

GSoC: First week of official development

Posted by David Carlos on June 08, 2017 04:11 PM

This post is a simple report of the first week of official development in the GSoC program. Kiskadee is almost ready for its first release, missing only a documentation review and a CI configuration with Jenkins. The current architecture of Kiskadee [1] consists of the components described below.

In this first release, Kiskadee is already able to monitor a Debian mirror and the Juliet [2] test suite. For these two targets, two plugins were implemented. We will briefly go over each Kiskadee component.

Plugins

In order to monitor different targets, Kiskadee allows plugins to be integrated into its architecture. Each plugin tells Kiskadee how a given target must be monitored and how the source code of that target will be downloaded.

We have defined a common interface that a plugin must follow; you can check it in the Kiskadee documentation [3].

Monitor

The monitor component is the entity that controls which monitored packages need to be analyzed. The responsibilities of the monitor are:

  • Dequeue packages queued by plugins.
  • Check if a dequeued package needs to be analyzed.
    • A package will be analyzed if it does not exist in the database, or if it exists but has a new version.
    • Save new packages and new package versions in the database.
  • Queue packages that will be analyzed by the Runner component.

We are using the default Python queue implementation, since the main purpose of this first release is to guarantee that Kiskadee can monitor different targets and run the analyses. A rough sketch of the monitor loop described above follows.
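
This is only an illustration of the flow, not Kiskadee's actual code; the database helpers (package_exists, save_package) and the queue names are assumptions:

import queue

plugins_queue = queue.Queue()   # filled by the plugins
analysis_queue = queue.Queue()  # consumed by the runner

def monitor(db):
    """Dequeue packages from plugins and decide whether they need analysis."""
    while True:
        package = plugins_queue.get()  # blocks until a plugin queues something
        if db.package_exists(package["name"], package["version"]):
            continue                   # same version already known: do nothing
        db.save_package(package)       # new package or new package version
        analysis_queue.put(package)    # hand it over to the runner component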

Runner

The runner component is the entity that triggers the analysis of the packages queued by the monitor. This trigger is made using Docker. In this release we call the container directly and run the static analyzer inside it, passing the source code as a parameter. For now we only support the Cppcheck tool. After we run the analysis, we parse the tool output using the Firehose tool [4] and save the parsed analysis in the database. We also update the package status, informing that an analysis was made. A rough sketch of this step follows.
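
Again purely as an illustration (not Kiskadee's code): invoking Cppcheck inside a throwaway container could look roughly like the sketch below, where the image name and mount path are assumptions; the XML report that Cppcheck writes to stderr would then be converted with the Firehose Cppcheck parser and stored.

import subprocess

def run_cppcheck_in_container(source_dir, image="kiskadee/cppcheck"):
    """Run Cppcheck against source_dir inside a container and return its XML report."""
    result = subprocess.run(
        ["docker", "run", "--rm", "-v", source_dir + ":/src:ro",
         image, "cppcheck", "--xml", "/src"],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
        universal_newlines=True)
    return result.stderr  # Cppcheck emits its XML results on stderr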

The next post will be a roadmap for the next Kiskadee release.

[1]https://pagure.io/kiskadee
[2]https://samate.nist.gov/SRD/testsuite.php
[3]https://pagure.io/docs/kiskadee/
[4]https://github.com/fedora-static-analysis/firehose

dsconf: Adding support for MemberOf plug-in

Posted by Ilias Stamatis on June 07, 2017 10:43 AM

Directory Server supports a plug-in architecture. It has a number of default plug-ins which configure core Directory Server functions, such as replication, classes of service, and even attribute syntaxes, but users can also write and deploy their own server plug-ins to perform whatever directory operations they need.

How do we configure these plug-ins at the moment? By applying LDIF files. Is this easy and straightforward for the admin? Nope.

I’m currently working on this ticket for adding support of the memberOf plugin into lib389: https://pagure.io/lib389/issue/31 What we want to achieve here is to be able to fully configure the plug-in using dsconf, a lib389 command line tool.

So, for example, the simple act of enabling the memberOf plugin becomes:

dsconf ldap://example.com memberof enable

Currently, if we want to achieve the same we have to apply the following LDIF file to the server using a tool such as ldapmodify:

dn: cn=MemberOf Plugin,cn=plugins,cn=config
changetype: modify
replace: nsslapd-pluginEnabled
nsslapd-pluginEnabled: on

The former is much simpler and more intuitive, right?

More examples of using dsconf to interact with memberOf:

dsconf ldap://example.com memberof show   # display configuration
dsconf ldap://example.com memberof status # enabled or disabled
dsconf ldap://example.com memberof fixup  # run the fix-up task
dsconf ldap://example.com memberof allbackends on # enable all backends
dsconf ldap://example.com memberof attr memberOf  # setting memberOfAttr

But that’s not all. Additionally, I will write unit tests for validating the plug-in’s functionality. That means checking that the C code – the actual plug-in – is doing what it is supposed to do when its configuration changes. Again, we are going to test the C code of the server using our python framework. That makes it clear that lib389 is not only an administration framework, but it is used for testing the server as well.

In the meanwhile, while working on memberOf support, I have discovered a lot of issues and bugs. One of them is that when the plug-in is disabled it doesn’t perform syntax-checking. So somebody could disable the plug-in, set some illegal attributes and then make the server fail. We’re going to fix this soon, along with more issues.

Until now I have raised the following bugs related to memberOf:
https://pagure.io/389-ds-base/issue/49283
https://pagure.io/389-ds-base/issue/49284
https://pagure.io/389-ds-base/issue/49274

This pretty much is how my journey begins. As I promised, in my next post I’ll talk about how the 389 DS development is taking place.


Week one: Summer coding report

Posted by squimrel on June 05, 2017 10:55 PM

Since I got to know everything I need to know in the community bonding period I could jump right into writing source code from day one.

The first three days were like a marathon. I stayed up for up-to 24 hours and my longest continuous coding session lasted 13 hours.

After those three days the part which I considered the most complex part of the project at the time was done. I was happy about that because even though I had already declared this project feasible in my feasibility study, now I had the proof, so I could calm down and relax.

Then I spent some time to look at how I’d add a vfat partition and add the overlay file to it. Since this is the platform specific part I looked at how I’d do that on Linux first.

Using libfdisk this worked just fine and I even figured out how to skip user interaction, but unfortunately I couldn't find a library that would create a vfat filesystem on the partition. Luckily my mentor pointed me to udisks, so I discarded the idea of using libfdisk since I'll use udisks instead.

In the meantime I've been working now and then on squimrel/iso9660 and I've also addressed the requested changes to the PR to isomd5sum which I worked on during the community bonding period. It's not directly project related, but it'd be great if we were able to use this as a proper dependency in the FMW instead of bundling it.

During next week the FMW helper code will be restructured a tiny little bit so that it’s easier to integrate squimrel/iso9660 since that’s cross-platform code.
Also Linux specific code that adds a vfat partition to the portable media device using udisks will be added.

GSoC2017 (Fedora) — Week 1

Posted by Mandy Wang on June 05, 2017 03:58 PM

I’m very exciting when I got the email that I was accepted by Fedora in GSoC2017. I will work for the idea – Migrate Plinth to Fedora Server – this summer.

I attended my graduation thesis defense today, and I had to spend most of my time on my graduation project last week, so I only did a little bit of work for GSoC in the first week. I will officially start my work this week – migrating the first set of modules from Deb-based to RPM-based.

This is the rough plan I made with Mentor Tong:

First Phase

  • Before June 5, Fedora wiki {Plinth (Migrate Plinth from Debian)}
  • June 6 ~ June 12, Coding: Finishing LDAP configuration First boot module
  • June 13 ~ June 20, Finish User register and admin manager
  • June 21 ~ June 26, Adjust Unit Test to adopt RPM and Fedora packages
  • Evaluation Phase 1

Second Phase

  • June 27 ~ July 8, Finish system config related models
  • July 9 ~ July 15, Finish all system models
  • July 16 ~ July 31, Finish one half APP models
  • Evaluation Phase 2

Third Phase

  • August 1 ~ August 13, Finish other app models
  • Final Test and finish wiki
  • Final Evaluation

My project

Posted by Ilias Stamatis on June 05, 2017 12:09 AM

In my previous blog post I mentioned that I’m working on the 389 Directory Server project. Here I’ll get into some more details.

389 Directory Server is an open-source, enterprise class LDAP Directory Server used in businesses globally and developed by Red Hat. For those who are not familiar with directory services, an LDAP server basically is a non-relational, key-value database which is optimized for accessing, but not writing, data. It is mainly used as an address book or authentication backend to various services but you can see it used in a number of other applications as well. Red Hat additionally offers a commercial version of 389 called Red Hat Directory Server.

The 389 Project is old with a large code base that has gone through nearly 20 years of evolution. Part of this evolution has been the recent addition of a python administration framework called lib389. This is used for the setup, administration and testing of the server, but it’s still a work in progress.

Until now, the administration of the server has always been a complex issue. Often people have to use the Java Console or apply LDIF files, both of which have drawbacks. Also, there is a variety of helper perl scripts that are installed along with the DS, but unfortunately the server cannot be managed with them alone. The primary goal of lib389 is to replace all these legacy scripts and add capabilities that are not currently provided. It has to be a complete one-stop solution while remaining command-line focused.

So, much of my work will be lib389-related. Fortunately there is no strict list of tasks to follow. The project offers much freedom (thanks William!) so I actually get to choose what I want to work on! I have begun by adding configuration support for a plug-in. I'll explain what this means in my next post. At a later stage I might do some work on the C code base and “move” from lib389 to the actual DS. I'm already really looking forward to it!

This was an overview of my project in general and I hope that I managed to effectively explain what it is about. Once more I haven’t given many details, but I’ll dive into more specific issues over the upcoming weeks. Additionally, I’ll publish a blog post explaining how the 389 DS development is done and discussing my personal work-flow as well.

Happy GSoC!


The config files on the ISO image can be modified

Posted by squimrel on June 01, 2017 07:23 AM

Mission accomplished!

At least kind of. There has been some work done in the past three days.
The result is a very tiny partial implementation of ECMA-119 that can grow files on the ISO 9660 image by a couple bytes if it’s lucky.

CD-ROM standards from 1987-1997

Posted by squimrel on May 30, 2017 02:46 PM

Specifically ECMA-119 (1987), ISO 9660:1988, ISO 9660:1995, Joliet (1995), IEEE P1282 (Rock Ridge draft — 1994), El Torito (1995). I also looked a little bit into ECMA-167 (1994) and ECMA-167 (1997) and I ignored all revisions of UDF.

Do I need to implement all those standards?

No, but I could if I’m bored. It took me quite a while to figure out that I actually only need ECMA-119, Joliet, IEEE P1282 and El Torito.

That’s also just theoretically. Since I practically only need to modify two files I most likely don’t have to implement anything according to Joliet, IEEE P1282 or El Torito.

Modifying files in a fixed-size ISO layout sounds expensive? Well, most likely we're lucky, since files are sector-aligned (2048 bytes) and the rest is padded out with zeros. If we're lucky we don't even need to move anything around on the image.

Why implement ECMA-119 even though it’s the oldest standard?

Because that’s what the ISO 9660 image I’ll have to handle uses.

What’s about ISO 9660?

This 2nd Edition of Standard ECMA-119 is technically identical with ISO 9660. — ECMA-119

That’s correct but only when referring to ISO 9660:1988. I feel like ECMA-119 provides much more detail and it’s also often refereed to from ECMA-167. Note that .iso files are generally referred to as an ISO 9660 image and that doesn’t seem to change.

Yes there’s a specified way to detect which standard the ISO 9660 image follows by the way.

I’d guess that ECMA-119 is used because the oldest standard must be the most portable one.

Coding officially started

You can follow my active development in the first two weeks on squimrel/iso9660.

Welcome to Google Summer of Code

Posted by David Carlos on May 30, 2017 12:24 AM

This is the first post of this blog, and as the first post I would like to announce that I was accepted into Google Summer of Code (GSoC) this year. GSoC is a Google program that takes place during the summer, with the objective of encouraging students all over the world to contribute to free and open-source software. The students choose some organization to contribute to, and submit a proposal for a new project, or for an existing project proposed by the organization. The organization that I submitted a proposal to was Fedora, a Red Hat like Linux distribution, developed by a great community of contributors from around the world. I have been using Linux for a long time, and now it's time to really start contributing to the community, helping to track the quality of the source code that goes into each package made available by Fedora.

As a software developer I have always had an interest in static analysis, and the benefits that such a practice can bring to the software development cycle. Many tools were developed for this purpose, but a system that permits the developer to easily integrate such analysis in their development cycle does not exist. A tool that tried to accomplish this goal was [Debile](https://wiki.debian.org/Services/Debile), developed in the context of the Debian distribution. The main problem of Debile is that it is tightly coupled to the Debian infrastructure, which does not allow running the analysis system on other sources of code. Our proposal for Google Summer of Code is to build an extensible system that, with a few steps from the developer, can continuously monitor and collect static analysis data from different sources of code. This idea was initially discussed on the devel mailing lists, by the folks in the [Static Analysis SIG](https://fedoraproject.org/wiki/StaticAnalysis) on Fedora. All the data collected will be stored in a database, and made available to the developers that want to monitor the quality of some source code. This system will be called Kiskadee [1], and it can already monitor some Debian mirrors. Now our objective is to monitor the Fedora repositories, and integrate more static analyzers into Kiskadee.

You can read my proposal [here](https://goo.gl/uEB2Qk) to better understand our objective, and follow the development process [here](http://pagure.io/kiskadee).

I will make weekly posts, reporting the status of Kiskadee development. Let's Code :)

[1] The great kiskadee is a bird that watches its prey (usually bugs) and catches it.

What This is About

Posted by Ilias Stamatis on May 27, 2017 07:57 PM

This blog is about my Google Summer of Code experience in the summer of 2017.

For those who may not know, GSoC is an international annual program organized by Google, open to university students. Every year there is a number of open-source organizations that participate in the program and students can apply for them by creating project/idea proposals. If accepted, students have to work all over the summer on their projects by writing code, documentation and tests. They also get to collaborate with the open-source community of their organization and their mentors, who are there to guide them through the process. Google awards stipends to all students who successfully complete their project during the summer.

Example open-source organizations for this year include The Linux Foundation, Mozilla, The GNU Project, GNOME, KDE, FreeBSD, Apache Software Foundation and many more. The full list of all 201 accepted organizations can be found here: https://summerofcode.withgoogle.com/organizations/

I’m currently participating in the GSoC program with the Fedora Project. More specifically, I’m working with the Red Hat team that develops the 389 Directory Server, helping with the development of a python framework for easier administration of the DS. In future posts I’m going to give more details about my project, what I have done already and what are the plans for this summer.

I’m going to post weekly updates in this blog about what I’m doing, what I’m learning and about the whole experience of being a GSoC participant in general.

Stay tuned!


Proof-of-concept of approach #1

Posted by squimrel on May 26, 2017 02:16 AM

Or how the Fedora Media Writer will create persistent storage enabled devices

As mentioned before the Fedora Media Writer should not depend on command line tools.

I talked about the different approaches on how to accomplish this task already. But it hasn’t been proven that they work yet.
I’ll go ahead and prove approach #1 since my mentor and I decided that I should work on this approach first.

The naive idea of approach #1 is to:

  1. Extract the ISO.
  2. Modify two configuration files to enable persistent storage.
  3. Build a new ISO.
  4. Write the new ISO on the portable media device.
  5. Add a partition to the portable media device that will be used for persistent storage.

To prove this I went ahead and wrote a bash script that does exactly that:

(Embedded gist: https://medium.com/media/7d3eff5be34a73e8fcf53fb7f627dfbe/href)

I tested whether the portable media device did in fact have persistent storage enabled. And it works!

Glad to be a Mentor of Google Summer Code again!

Posted by Tong Hui on May 25, 2017 04:44 PM
This year I will be mentoring in the Fedora Project, helping Mandy Wang finish her GSoC project "Migrate Plinth to Fedora Server", which was proposed by me. So, why did I propose this idea? Plinth is developed by FreedomBox, which is a Debian based project. FreedomBox is aiming to build a 100% free software self-hosting web server to … Continue reading "Glad to be a Mentor of Google Summer Code again!"

How the overhead could be reduced

Posted by squimrel on May 23, 2017 03:56 AM

An approach to ISO manipulation

I talked about different approaches on how to manipulate an ISO before. In approach one I particularly complained about how unhappy I am with the expected overhead.

When I started to look at source code that actually deals with ISO 9660 I realized that I got the power! I’m not limited to command-line tools so who says that I actually have to extract everything from the ISO to modify it?

To read files from an ISO I can use libcdio. To build one with “El Torito” and all that fancy stuff I’d have to use cdrtools.

Build a bootable ISO using cdrtools

Unfortunately cdrtools is not built in a way that lets me easily use it. It still lives in an SCCS repository and it's designed so that mkisofs works, not so that I can easily reuse its source code. It's simply not a library. Also the code contains stuff like:

(Embedded gist: https://medium.com/media/b6b81b68aa8fbc056f34b175d4e58883/href)

I mean I get it. This project is older than me and no one likes to work on it so there’s no one to blame.

By the way, Fedora ships genisoimage, which is part of cdrkit, and cdrkit is a fork of cdrtools that has lived in a git repository since 2006 but has not been maintained since 2010. On the other hand cdrtools is under "active" development. Anyways, in both repositories the situation is equally bad.

Build from scratch

Since I don’t want to reuse the source code I’ll most likely read the specification of ISO 9660 and “El Torito” and build what I need from scratch.

Ideally the program would calculate the md5sum and apply isohybrid as part of the build process so that no post-processing is required.

Drop isohybrid

Probably isohybrid is not even required, since if what I proposed is implemented well enough, the program should be able to go ahead and modify what is needed without destroying the isohybrid layout that is already provided by the released ISO.

Overwhelmed

Obviously I won’t try to do this all at once but design with this idea in mind. Most likely I’ll look at how to build an ISO from the extracted files first before I’ll do any of the fancy stuff I proposed.

Coding will start on May 30. Until then I should have read the specifications.

Reading source code that creates bootable devices

Posted by squimrel on May 19, 2017 10:03 PM

Someone must have figured out how to properly create a bootable media device, right? Cross-platform? Not depending on a lot of external tools?

From the source code I’ve read so far there’re basically the following two categories:

  • Linux only and depends on a bunch of Linux command line tools.
  • Horrifying source code and only works on Windows and/or Mac by porting some Linux command line tools or using other equivalent vendor tools and binaries that pop out of nowhere.

Well guess what. I'd like to avoid calling any external program if at all possible. Only last year the Fedora Media Writer moved away from having dd as a dependency. Let's keep it that way.

Maybe it’s a good thing that I’m on my own. Otherwise I’d do something someone has already done before and that would be evil right?

Hello Summer Coding World

Posted by squimrel on May 18, 2017 12:08 PM

I’m a nobody who accomplished nothing in life. You may call me squimrel.

As the title suggests I’ll write source code this summer. Specifically I’ll work on the Fedora Media Writer project which is able to write live Fedora images on portable media devices.

One (me) should make it possible for the images on the portable drives to have persistent storage enabled. This will be achieved by modifying two configuration files and adding another partition for persistent storage.

It’s not clear how exactly the program should modify the configuration files on the live image in which the flags for persistent storage can be set yet. There are multiple approaches and every single one of them comes with different challenges about which I’ll talk when I face them.

It’s doesn’t have to be said that making this work on Windows and Mac comes as an extra challenge itself.

I’m a bit uncertain how this project will work out but I’m confident that source code will be written and that I’ll learn a lot about accomplishing this task using C++.

If you wish to know more about me, my “experience”, etc. you can read my summer coding application at the Fedora Project Wiki.

Refactor a project written in C

Posted by squimrel on May 17, 2017 08:14 AM

Namely isomd5sum. The first commit to the repository it previously lived in was in 2002. From the commit message “Add isomd5sum stuff to anaconda” one could guess that it originated somewhere else, so it might be older than that.

isomd5sum is a very small project and, like most old projects written in C, it works just fine. Over the years 13 different users created 111 commits. The last one was merged two years ago, in 2015.

So it’s not a very active project. That’s not bad either because it solves the problem it’s supposed to solve already:

isomd5sum provides a way of making use of the ISO9660 application data area to store md5sum data about the iso.

Why refactor?

Well right there’s no particular reason to do that especially since there’re only minor bugs like forgetting to free a kilobyte of memory before exit or using more memory than needed to make it easier to avoid a potential buffer-overflow when passed a manipulated ISO. These kind of errors can be changed without refactoring everything.

But this project is a dependency of the Fedora Media Writer project and I'd like Fedora Media Writer to use it as a proper dependency using git submodules. Currently it's just a copy of the project's source code with some commits added on top of it.

To make this happen I’ll need to apply a couple patches to isomd5sum anyways and I prefer working on source code I don’t hate too much. So better refactor everything so that I’ll have to hate myself if I’m still unhappy with it.

Also the project does need refactoring since it’s full of magic values and other things that we call bad practices nowadays.

So even though the refactor does not improve the functionality (it might even have introduced bugs; you can never be sure), it simplifies things and makes a lot of the changes that I will have to make anyways.

Currently my PR is still under review. I’d be happy if you’d help reviewing it.

3D Printing Fun

Posted by Mo Morsi on May 14, 2017 06:52 PM

The PI Sw1tch project took a bit of a setback recently when we discovered the 5200mAh USB battery we had wasn't sufficient to drive the PI + display for any extensive period of time. I've ordered a higher-capacity battery from Amazon, but until it arrives the working prototype was redesigned into a snappable wall mount:

[Photos: wall-mounted PI]

The current implementation can be found on thingiverse for all your 3D printing needs!

Additionally, we threw together a design for a wall mount for my last smartphone, the Samsung Intercept, which has been sitting on my shelf since I upgraded to the Huawei Union. The Intercept was a great phone, and it still works well, albeit a bit slow compared to modern devices (not to mention it only runs Android 2.2). But it more than suffices for a "smart" entertainment hub, and having mounted it & wired it up to my stereo system, I now have easy access to all the albums in the world (...that are available via youtube...). The device supposedly can be rooted, though I was not able to accomplish that myself and don't really care to spend more time figuring that out (really wouldn't gain much). But it just goes to show how a little ingenuity + some design work can go a long way at reducing e-waste.

(Images: Wall intercept2, Wall intercept1)

Now time to figure out what to do w/ my ancient Droid (the original A855!)

Keep on hacking!

RetroFlix - A Weekend Project

Posted by Mo Morsi on May 14, 2017 06:52 PM

Now that we have the 'mount' component of the PI Sw1tch, and an awesome way to play games through the PI on our TV, we need a collection of games to play! It goes without saying that Nethack was installed (combined w/ ssh X11 forwarding = persistent graphical nethack anywhere = epicness!!!). But I also happen to have a huge box of retro video games (dating back to my childhood), which would be good to load onto the device. Unfortunately cloning so many games would take some time, and there are already online databases of these backups, so I opted to write a small web app to download and manage the collection.

It can be found here and you can see some screens below:

(Screenshots: My library, Game info, Game previews, Game list)

It was built as a Sinatra web service, simply acting as a frontend to a popular emulator database, allowing the user to navigate & preview apps for various systems, and download / run them locally. The RetroFlix application itself is offered as a lightweight microservice, simply acting as a proxy to the various underlying components required. It's fairly simple to set up & install (see the README), and builds upon existing emulators & components the user has locally.

As with everything else, it's still a work in progress, but it already suffices for reliving classic memories!

<iframe frameborder="0" height="315" src="https://www.youtube.com/embed/QhH_iibOJv0" width="560"></iframe>

Approaches to enable persistent storage on ISOs

Posted by squimrel on May 11, 2017 12:58 PM

Basically just two configuration files need to be modified. The question is: how do you modify a configuration file on an ISO using C++?

Ideally one would like an easy-to-port solution without any dependencies and without any extra time spent, and obviously it should be beautiful and extendable too.

I didn’t find such an approach yet. If you have a good approach in mind please query me on IRC.

Currently, writing the ISO to disk works as summarized below (a rough sketch follows the summary).

<iframe frameborder="0" height="0" scrolling="no" src="https://medium.com/feed/@squimrel" width="0">https://medium.com/media/440d363ac75d755193c7febc29e1abf6/href</iframe>

It’s great since it doesn’t need any dependencies. There’s a different implementation for every supported platform but that’s okay.

Short summary:

  1. Write ISO to disk.
  2. Calculate checksum of ISO written to disk to verify that everything worked correctly.
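
As a rough illustration of that write-then-verify pattern, here’s a sketch using plain C++ file streams. The real code uses platform-specific raw-device I/O and verifies via an md5 checksum rather than comparing bytes directly, and the paths below are only placeholders:

#include <algorithm>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// 1. Copy the ISO to the target. 2. Read the target back and verify it.
static bool copyAndVerify(const std::string& isoPath, const std::string& targetPath) {
    constexpr std::size_t kChunk = 1 << 20;  // 1 MiB blocks
    std::vector<char> buffer(kChunk), check(kChunk);

    {
        std::ifstream src(isoPath, std::ios::binary);
        std::ofstream dst(targetPath, std::ios::binary);
        if (!src || !dst) return false;
        while (src.read(buffer.data(), buffer.size()) || src.gcount() > 0)
            dst.write(buffer.data(), src.gcount());
    }

    std::ifstream src(isoPath, std::ios::binary);
    std::ifstream dst(targetPath, std::ios::binary);
    while (src.read(buffer.data(), buffer.size()) || src.gcount() > 0) {
        dst.read(check.data(), src.gcount());
        if (dst.gcount() != src.gcount() ||
            !std::equal(buffer.begin(), buffer.begin() + src.gcount(), check.begin()))
            return false;
    }
    return true;
}

int main() {
    // "/dev/sdX" stands in for the portable media device.
    std::cout << (copyAndVerify("Fedora.iso", "/dev/sdX") ? "ok" : "mismatch") << "\n";
    return 0;
}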

1. Create an ISO that is already configured so that persistent storage is enabled

Unfortunately ISO 9660 was designed to be read-only (correct me if I’m wrong), so one can’t just mount it and modify a config file.

Instead the program has to:

  1. Extract the files from the ISO.
  2. Modify the config file.
  3. Build a new ISO.

This would require at least the following two dependencies: libcdio (for iso9660) and syslinux (for isohybrid).

This yields the following overhead (n = size of the ISO):

  • 2n disk space (ISO + extracted ISO)
  • 4n disk read (read ISO, read extracted ISO, read ISO, calculate checksum)
  • 3n disk write (extract ISO, pack ISO, write ISO to portable media)

and therefore it would take around 1.4 times longer than it already takes.

2. Modify the config file in the ISO through byte manipulation

Yes it’s a bad idea and it’s evil but maybe it works.

Why it’s a bad idea:

  • It’ll break the inner checksum of the ISO.
  • Stuff such as comments has to be removed from the config file so that there’s enough space to add the corresponding flags.
  • It’s very likely that it’s unreliable and buggy in special cases.
  • It’s hard and messy to extend.

This would take just a little longer than it currently does because the manipulation could be done while writing to the portable media device.
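
To make this concrete, below is a deliberately naive sketch of what such byte manipulation could look like: scan the raw image for a known byte sequence and overwrite it in place with a replacement of at most the same length, padded with spaces so that no file sizes or offsets change. The marker and switch in the example are hypothetical; the real code would first have to locate grub.cfg inside the image and craft a replacement that still parses.

#include <cstddef>
#include <fstream>
#include <iostream>
#include <string>

// Finds `marker` inside the raw image and overwrites it in place with
// `replacement`, padded with spaces so the total length stays the same.
static bool patchInPlace(const std::string& imagePath,
                         const std::string& marker,
                         std::string replacement) {
    if (replacement.size() > marker.size()) return false;  // must not grow
    replacement.resize(marker.size(), ' ');

    std::fstream image(imagePath, std::ios::in | std::ios::out | std::ios::binary);
    if (!image) return false;

    // Search in overlapping chunks so a marker spanning a chunk boundary
    // is still found; `base` is the file offset of window[0].
    const std::size_t chunkSize = 1 << 20;
    std::string chunk(chunkSize, '\0'), window;
    std::streamoff base = 0;
    while (image.read(&chunk[0], chunk.size()) || image.gcount() > 0) {
        window.append(chunk.data(), static_cast<std::size_t>(image.gcount()));
        const std::size_t pos = window.find(marker);
        if (pos != std::string::npos) {
            image.clear();  // clear a possible eof flag before seeking
            image.seekp(base + static_cast<std::streamoff>(pos));
            image.write(replacement.data(), replacement.size());
            return static_cast<bool>(image);
        }
        if (window.size() > marker.size()) {
            base += static_cast<std::streamoff>(window.size() - marker.size());
            window.erase(0, window.size() - marker.size());
        }
    }
    return false;
}

int main() {
    // Hypothetical fragment and switch; a real patch has to keep grub.cfg valid.
    if (!patchInPlace("Fedora.iso", "quiet rhgb rd.live.check", "rd.live.overlay"))
        std::cerr << "marker not found or patch failed\n";
    return 0;
}

Since the patch happens in place, it could indeed be applied while streaming the image to the portable media device, as described above.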

3. Make the portable media device bootable from scratch and copy what’s needed from the ISO

This is very similar to the livecd-iso-to-disk approach, but it will be hard to implement in a way that can be ported to platforms other than Linux.

This approach also requires even more dependencies than approach #1.

Note

Obviously the time it takes to create the overlay partition still has to be taken into account in every approach.

A New Site, A Fresh Start

Posted by Mo Morsi on April 29, 2017 03:59 AM

I started this blog 10 years ago. How the world has changed... (and yet is still the same...)

Recently I noticed my site was inaccessible. No 404, no error response, just a blank page. After a brief moment of panic, I ssh'd into my host provider and breathed a sigh of relief upon discovering all db & fs entities intact, including the instance of Drupal which my site was based on (a horribly outdated instance mind you, and note I said was based). In any case, I suspect my (cheap) host provider updated their version of PHP or some critical library w/ the effect that my Drupal instance stopped working.

(Image: Dogefox)

Having struggled w/ PHP & Drupal many times over the years, I was finally ready to go cold turkey, and migrated the blog to Middleman, which brings the awesomeness of Rails to static site generation. I am very much in love with Middleman right now; it's the perfect tool for this problem domain. It's incredibly easy to set up a new site, use any high-level templating / markup / styling language to customize your frontend, throw in any js or other framework to handle dynamic interactions (including emscripten to run C in the browser), and you're good to go. Tailoring things on the fly is a cinch thanks to the convenient embedded webserver sporting live-reloading, and when you're ready to push to production it's a single command to build the static html. A quick rsync -azP synchronizes it w/ your webserver and now your site is available to the world at blazing speeds!

Anyways, enough Middleman gushing (but seriously check it out!). In addition to the port, I rethemed the site; be sure to also check out the new design if you're reading this via rss. Note mobile browser UIs aren't currently supported, so no angry emails if you can't read things on your phone! (I know they're coming...)

Be sure to stay subscribed to github for updates. I'm hoping virtfs-refs will see some love soon if I can figure out how to extend the current fs parsing mechanisms w/ file content retrieval. We've also been prototyping designs for the PI Switch project I mentioned a while back, more updates on that soon as things progress.

Keep surfing!!!

Compiling / Playing NetHack 3.6.0 on Fedora

Posted by Mo Morsi on April 26, 2017 08:43 PM

The following are the simplest instructions required to compile NetHack 3.6.0 for Fedora 25.

Why might you want to compile NetHack from source, instead of simply installing the package (sudo dnf install nethack)? For many reasons. Applying patches for custom game mechanics. Running an alternate frontend. And more!

While the official Linux instructions are complete, they are pretty involved and must be followed exactly for things to work. To give the dev team credit, they’ve been supporting a plethora of platforms and environments for 20+ years (and the number is still increasing). A consolidated guide was written for compiling NetHack from scratch on Ubuntu/Debian, but nothing exists for Fedora… until now!


# On a fresh Fedora installation (with updates) install the dependencies:

$ sudo dnf install ncurses-devel libXt-devel libXaw-devel byacc flex

# Download the NetHack (3.6.0) source tarball from the official site and unpack it:

$ tar xzvf [download]
$ cd nethack-3.6.0/

# Run the base setup utility for Linux:

$ cd sys/unix
$ ./setup.sh hints/linux
$ cd ../..

# Edit [include/unixconf.h] to uncomment the following line…

#define LINUX

# Edit [include/config.h] to uncomment the following line…

#define X11_GRAPHICS

# Edit [src/Makefile] and update the following lines…

WINSRC = $(WINTTYSRC)
WINOBJ = $(WINTTYOBJ)
WINLIB = $(WINTTYLIB)

# …to look like so

WINSRC = $(WINTTYSRC) $(WINX11SRC)
WINOBJ = $(WINTTYOBJ) $(WINX11OBJ)
WINLIB = $(WINTTYLIB) $(WINX11LIB)

# Edit [Makefile] to uncomment the following line

VARDATND = x11tiles NetHack.ad pet_mark.xbm pilemark.xpm rip.xpm

# In previous line, apply this bugfix by changing…

pilemark.xpm

# …to

pilemark.xbm

# Build and install the game

$ make all
$ make install

# Finally create [~/.nethackrc] config file and populate it with the following:

OPTIONS=windowtype:x11


# To play:

$ ~/nh/install/games/nethack

Go get that Amulet!

Project Idea - PI Sw1tch

Posted by Mo Morsi on April 25, 2017 12:07 PM

While gaming is not high on my agenda anymore (... or rather at all), I have recently been mulling buying a new console, to act as much as a home entertainment center as a gaming system.

Having owned several generations of PlayStation and Sega products, I found a few new consoles catching my eye. While the most "open" solution, the Steambox, sort of fizzled out, Nintendo's latest console, the Switch, does seem to stand out from the crowd. The balance between power and portability looks like a good fit, and given Nintendo's previous successes, it wouldn't be surprising if it became a hit.

In addition to the separate home and mobile gaming markets, new entertainment mechanisms need to provide seamless integration between the two environments, as well as offer comprehensive data and information access capabilities. After all, what would be the point of a gaming tablet if you couldn't watch Youtube on it! Neal Stephenson recently touched on this at his latest TechCrunch talk, expressing a vision of technology that is more integrated/synergized with our immediate environment. While mobile solutions these days offer a lot in terms of processing power, nothing quite offers the comfort or immersion that a console / home entertainment solution provides (not to mention mobile phones being horrendous interfaces for gaming purposes!)

Being the geek that I am, I naturally started thinking about developing a hybrid mechanism of my own, based on open / existing solutions, so that it could be prototyped and demonstrated quickly. Having recently bought a Raspberry PI (after putting my Arduino to use in my last microcontroller project), and a few other odds and ends, I whipped up the following:

(Image: Pi sw1tch)

The idea is simple: the Raspberry PI would act as the 'console', with a plethora of games and 'apps' available (via open repositories, steam, emulators, and many more... not to mention Nethack!). It would be anchorable to the wall, desk, or any other surface by using a 3D-printed mount, and made portable via a cheap wireless controller / LCD display / battery pack setup (tied together through another custom 3D-printed bracket). The entire rig would be quick to assemble and easy to use: simply snap the PI into the wall to play on your TV; remove it and snap it into the controller bracket to take it on the go.

I suspect the power component is going to be the most difficult to nail down; finding an affordable USB power source that is lightweight but offers sufficient juice to drive the Raspberry PI w/ LCD might be tricky. But if this is done correctly, all components will be interchangeable, and one can easily plug in a lower-power microcontroller and/or custom hardware component for a tailored experience.

If there is any interest, let me know via email. If 3 or so people commit, this could be done in a weekend! (stay tuned for updates!)

Nethack Encyclopedia Reduxd

Posted by Mo Morsi on April 24, 2017 05:23 PM

I've been working on way too many projects recently... Still, I was able to slip in some time to update the NetHack Encyclopedia app on the Android MarketPlace (first released nearly 5 years ago!).

Version 5.3 brings several features, including new useful tools. The first is the Message Searcher, which allows the user to quickly query the many cryptic game messages by substring & context. Additionally the Game Tracker has been implemented, facilitating player, item, and level identification in a persistent manner. Simply enter entity attributes as they are discovered and the tracker will deduce the remaining missing information based on its internal algorithm. This is on top of many enhancements to the backend, including the incorporation of a searchable item database.

The logic of the application has been heavily refactored & cleaned up; the code has come a long way since it was first written. By and large, I feel pretty comfortable with the Android platform at this point. It has its nuances, but all platforms do, and it's pretty easy to go from concept to implementation.

As far as the game itself goes, I have a ways to go before retrieving the Amulet! It's quite a challenge, but you learn with every replay, and thus you get closer. Ascension will be mine! (someday)

(Screenshots: Nethack 5.3 screen1, screen2, screen3, screen4)

Lessons on Aikido and Life via Splix

Posted by Mo Morsi on April 24, 2017 05:23 PM

Recently I stumbled upon splix, my new gaming obsession: simple mechanics that unfold into a complex competitive challenge requiring fast reflexes and dynamic tactics.

(Image: Splix intro)

At the core the rule set is very simple:

  • surround territory to claim it
  • do not allow other players to hit your tail (you lose... game over)

(Image: Splix overextended)

While in your territory you have no tail, which renders you invulnerable. But during battles territory is always changing, and you don't want to get caught deep in an attack only to be surrounded by an enemy who swaps the territory alignment to his!

(Image: Splix deception)

The simple dynamic yields an unbelievable amount of strategy & tactics to excel at, while at the same time requiring quick calculation and planning. A foolhardy player will just rush into enemy territory to attempt to capture squares and attack his opponent, but a smart player will bait his opponent into his sphere of influence through tactful strikes and misdirection.

(Image: Splix bait)

Furthermore we see age-old adages such as "better to run and fight another day" and the wisdom of pitting opponents against each other. Alliances are always shifting in splix; it simply takes a single tap from any other player to end your game. So while you may be momentarily coordinating with another player to surround and obliterate a third, watch your back, as the alliance may dissolve at the first opportunity (not to mention the possibility of outside players appearing at any time!)

(Image: Splix alliance)

All in all, I've found careful observation and quick action to yield the most successful results on the battlefield. The ideal kill is from behind an opponent who has perilously invaded your territory too deeply. Beyond this, lurking at the border so as to goad the enemy into a foolhardy / reckless attack is a robust tactic, provided you have built up the reflexes and coordination to quickly move in and out of territory which is constantly changing. Make sure you don't fall victim to your own trick and overpenetrate the enemy border!

(Image: Splix bait2)

Another tactic to deal w/ an overly aggressive opponent is to fall back slightly into your safe zone and quickly return to the front afterwards, perhaps at a different angle or via a different route. Often a novice opponent will see the retreat as a sign of fear or weakness and become overconfident, penetrating deep into your territory in the hopes of securing a large portion quickly. By returning to the front at an unexpected moment, you will catch the opponent off guard and be able to destroy them before they have a chance to retreat to their safe zone.

(Image: Splix draw out)

Of course, if the opponent employs the same strategy, a player can take a calculated risk and drive a distance into the enemy territory before returning to the safe zone. By paying attention to the percentage of visible territory which the player's vulnerability zone occupies, and to the relative position of the opponent, they should be able to gauge how far they can extend while still ensuring a safe return. Taking large amounts of territory quickly is psychologically damaging to an opponent, especially one undergoing attacks on multiple fronts.

(Image: Splix draw out2)

If all else fails to overcome a strong opponent, a reasonable retreat followed by an alternate attack vector may result in success. Since in splix we know that a safe zone corresponds to only one enemy, if we can gauge / guess where they are, we can attempt to alter the dynamics of the battle accordingly. If we see that an opponent has stretched far beyond the mass of his safe zone via a single / thin channel, we can attempt to cut him off, so that he cannot retreat without crossing your sphere of influence.

(Image: Splix changing)

This dynamic becomes even more pronounced if we can encircle an opponent, and start slowly reducing his control of the board. By slowly but mechanically & gradually taking enemy territory we can drive an opponent in a desired direction, perhaps towards a wall or other player.

(Image: Splix tactics2)

Regardless of the situation, the true strategist will always be shuffling his tactics and actions to adapt to the board and set up the conditions for guaranteed victory. At no point should another player be underestimated or trusted. Even a new player with little territory can pose a threat to the top of the leaderboard given the right conditions and timing. The victorious will stay calm in the heat of the battle, and use careful observation, timing, and quick reflexes to win the game.

(Endnote: the game *requires* a keyboard; it can be played via smartphone (swiping) but the arrow keys yield the fastest feedback.)