
PV-Paging Documentation

March 3, 2011

Here is the documentation for PV-Paging in Xen.

Xen mmu operations

This file contains the various mmu fetch and update operations.
The most important job they must perform is the mapping between the
domain’s pfn and the overall machine mfns.

Xen allows guests to directly update the pagetable, in a controlled fashion. In other words, the guest modifies the same pagetable
that the CPU actually uses, which eliminates the overhead of having a separate shadow pagetable.

In order to allow this, it falls on the guest domain to map its notion of a “physical” pfn – which is just a domain-local linear
address – into a real “machine address” which the CPU’s MMU can use.

A pgd_t/pmd_t/pte_t will typically contain an mfn, and so can be inserted directly into the pagetable. When creating a new
pte/pmd/pgd, the guest kernel converts the passed pfn into an mfn. Conversely, when reading the content back with
__(pgd|pmd|pte)_val, it converts the mfn back into a pfn.

The other constraint is that all pages which make up a pagetable must be mapped read-only in the guest. This prevents
uncontrolled guest updates to the pagetable. Xen strictly enforces this: it will disallow any pagetable update which would end up
mapping a pagetable page RW, and will disallow using any writable page as a pagetable.
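
The pfn/mfn conversion described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the toy_ names, the four-entry p2m table, and the 12-bit flag field are all invented for illustration and are not the kernel's actual helpers or data.

```c
/* Toy model: every name and value here is illustrative, not kernel code. */
#define TOY_PAGE_SHIFT 12
#define TOY_NR_PAGES   4
#define TOY_FLAG_MASK  ((1UL << TOY_PAGE_SHIFT) - 1)

/* Hypothetical p2m table: index is a guest pfn, value is a machine mfn. */
static const unsigned long toy_p2m[TOY_NR_PAGES] = { 0x1a3, 0x07c, 0x2f0, 0x011 };

static unsigned long toy_pfn_to_mfn(unsigned long pfn)
{
    return toy_p2m[pfn];
}

/* Reverse lookup by linear search, which is good enough for a toy table. */
static unsigned long toy_mfn_to_pfn(unsigned long mfn)
{
    for (unsigned long pfn = 0; pfn < TOY_NR_PAGES; pfn++)
        if (toy_p2m[pfn] == mfn)
            return pfn;
    return ~0UL; /* not one of this guest's frames */
}

/* Build a pte: the frame number stored in the entry is the mfn. */
static unsigned long toy_make_pte(unsigned long pfn, unsigned long flags)
{
    return (toy_pfn_to_mfn(pfn) << TOY_PAGE_SHIFT) | (flags & TOY_FLAG_MASK);
}

/* Read a pte back: the caller sees a pfn again, with the flags untouched. */
static unsigned long toy_pte_val(unsigned long pte)
{
    unsigned long mfn = pte >> TOY_PAGE_SHIFT;
    return (toy_mfn_to_pfn(mfn) << TOY_PAGE_SHIFT) | (pte & TOY_FLAG_MASK);
}
```

In this model, toy_make_pte(2, 0x3) stores mfn 0x2f0 in the entry, and toy_pte_val() on that entry returns pfn 2 with the same flags, so code above this layer only ever sees pfns.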

Naively, when loading %cr3 with the base of a new pagetable, Xen would need to validate the whole pagetable before going on.
Naturally, this is quite slow. The solution is to “pin” a pagetable, which enforces all the constraints on the pagetable even
when it is not actively in use. This means that Xen can be assured that it is still valid when you do load it into %cr3, and doesn’t
need to revalidate it.

Note about cr3 (pagetable base) values: xen_cr3 contains the current logical cr3 value; it contains the last set cr3. This may
not be the current effective cr3, because its update may be being lazily deferred. However, a vcpu looking at its own cr3 can use
this value knowing that everything will be self-consistent.

xen_current_cr3 contains the actual vcpu cr3; it is set once the hypercall to set the vcpu cr3 is complete (so it may be a little
out of date, but it will never be set early). If one vcpu is looking at another vcpu’s cr3 value, it should use this variable.
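
The relationship between the two values can be sketched with a toy model. The struct and the toy_ functions below are assumptions made for illustration; they are not the kernel's implementation, which keeps these values per cpu and batches the hypercall differently.

```c
/* Toy model of one vcpu's cr3 bookkeeping; not the kernel's code. */
struct toy_vcpu {
    unsigned long xen_cr3;         /* logical value: last cr3 written */
    unsigned long xen_current_cr3; /* value the hypervisor has applied */
    int           update_pending;  /* a deferred cr3 switch is queued */
};

/* The guest writes cr3: record it at once, defer the hypercall. */
static void toy_write_cr3(struct toy_vcpu *v, unsigned long cr3)
{
    v->xen_cr3 = cr3;
    v->update_pending = 1;
}

/* The deferred hypercall completes; only now does the effective value change. */
static void toy_flush(struct toy_vcpu *v)
{
    if (v->update_pending) {
        v->xen_current_cr3 = v->xen_cr3;
        v->update_pending = 0;
    }
}

/* A vcpu inspecting itself uses the logical value. */
static unsigned long toy_read_own_cr3(const struct toy_vcpu *v)
{
    return v->xen_cr3;
}

/* A vcpu inspecting another vcpu uses the applied value. */
static unsigned long toy_read_remote_cr3(const struct toy_vcpu *v)
{
    return v->xen_current_cr3;
}
```

Between toy_write_cr3() and toy_flush() the two readers disagree: the owner already sees the new value, while a remote observer still sees the old one, which may be slightly stale but is never ahead of what the hypervisor has applied.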

Xen leaves the responsibility for maintaining p2m mappings to the guests themselves, but it must also access and update the p2m
array during suspend/resume when all the pages are reallocated.

The p2m table is logically a flat array, but we implement it as a three-level tree to allow the address space to be sparse.

The p2m_mid_mfn pages are mapped by p2m_top_mfn_p. The p2m_top and p2m_top_mfn levels are limited to 1 page, so the
maximum representable pseudo-physical address space is:
P2M_TOP_PER_PAGE * P2M_MID_PER_PAGE * P2M_PER_PAGE pages

P2M_PER_PAGE depends on the architecture, as an mfn is always unsigned long (8 bytes on 64-bit, 4 bytes on 32-bit), leading to
512 and 1024 entries respectively.
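
As a worked example of the arithmetic above, here is a minimal sketch that computes the tree capacity and splits a pfn into its three indices. It assumes 4 KiB pages and that entries at all three levels have the same size (pointers and unsigned long are the same width). The function and struct names are made up for this example, and the entry size is passed as a parameter so both cases can be checked on any build machine.

```c
#define TOY_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */

/* Entries per page for a given entry size (8 on 64-bit, 4 on 32-bit). */
static unsigned long p2m_entries_per_page(unsigned long entry_size)
{
    return TOY_PAGE_SIZE / entry_size;
}

/* Maximum number of pages a three-level tree can describe. */
static unsigned long long p2m_max_pages(unsigned long entry_size)
{
    unsigned long long n = p2m_entries_per_page(entry_size);
    return n * n * n;
}

/* Position of one pfn inside the tree. */
struct p2m_index {
    unsigned long top;  /* index into the single top-level page */
    unsigned long mid;  /* index into the selected mid-level page */
    unsigned long leaf; /* index into the selected leaf page */
};

/* Split a flat pfn into top/mid/leaf indices. */
static struct p2m_index p2m_split(unsigned long pfn, unsigned long entry_size)
{
    unsigned long n = p2m_entries_per_page(entry_size);
    struct p2m_index idx;

    idx.top  = pfn / (n * n);
    idx.mid  = (pfn / n) % n;
    idx.leaf = pfn % n;
    return idx;
}
```

With 8-byte entries this gives 512 * 512 * 512 = 134217728 pages, which is 512 GiB of pseudo-physical address space at 4 KiB per page; with 4-byte entries it gives 1024 * 1024 * 1024 = 1073741824 pages, or 4 TiB.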

On 64-bit, the initial kernel pagetable is constructed by grafting the Xen-provided pagetable into head_64.S’s preconstructed
pagetables. We copy the Xen L2s into level2_ident_pgt, level2_kernel_pgt and level2_fixmap_pgt. This means that only the kernel
has a physical mapping to start with, but that’s enough to get __va working. We need to fill in the rest of the physical mapping
once some sort of allocator has been set up.



3 Comments
  1. Satyajeet Nimgaonkar

    Hi,
    I am working on a research project which involves mapping DomU kernel memory into Dom0. I am using the function xc_map_foreign_range() to achieve this. This function takes the mfn as an argument. Hence, to get the mfn, I am planning to get the page table base address (the value of CR3) for one of DomU’s VCPUs and then map the kernel pages by walking that part of the page table, which will contain MFNs.
    I read the PV-Paging Documentation above and had a question as to how and where I should look in the Xen source to get the current value of cr3.
    Any help would be greatly appreciated.
    Thanks.
    Regards,
    Satyajeet

    • For paging-related operations in Xen, check the file xen/arch/x86/mm.c, and also check the other files and subdirectories.
      I would also suggest reading the documentation given in that file (xen/arch/x86/mm.c). You can use vcpu->arch.cr3 to get the current value of cr3 for a domain. You can use the make_cr3 function to update the cr3 value.

      Hope this will help !!!

      • Satyajeet

        Thanks for the reply. I tried to get the value of cr3 using a different technique. I am invoking a custom hypercall and attempting to read the value in the hypercall handler. Below is my handler code. Here I am creating a vcpu_guest_context object and trying to read the value of ctrlreg[3] using the xen_cr3_to_pfn macro, but I am getting the value of this (i.e. pgdaddr) as 0. I also found a function called read_cr3(), and I am getting 8-digit hex values from it, e.g. 27ca4000. Does this look right? Also, what is the difference between read_cr3 and xen_cr3_to_pfn? I read in some documentation that read_cr3 is not reliable and that xen_cr3_to_pfn should be used instead. It would also be great if you could tell me what is wrong in my use of xen_cr3_to_pfn, as I am getting the value of pgdaddr as 0.

        unsigned long CR3;
        unsigned long pgdaddr;
        vcpu_guest_context_t ctx;

        void do_jeet1(void)
        {
            printk("Successfull Hypercall made to __HYPERVISOR_jeet1\n");
            CR3 = read_cr3();

            pgdaddr = (xen_cr3_to_pfn(ctx.ctrlreg[3])) >> 12;

            printk("CR3:%lx\n", CR3);
            printk("PGDADDR:%lx\n", pgdaddr);
        }
