ESP32-S3: Support execute in place from PSRAM
This implementation mirrors the ESP-IDF implementation of this feature
(which is based on the `Cache_Flash_To_SPIRAM_Copy` ROM function), but
differs in a few key ways:

The ESP-IDF seems to map `.text` and `.rodata` into the first and second
128 cache pages respectively (I could not work out how from the linker
scripts, but a runtime check confirmed that this appears to be the
case). This is reflected in how the `Cache_Count_Flash_Pages` and
`Cache_Flash_To_SPIRAM_Copy` ROM functions, and the ESP-IDF code
calling them, work. The count function can only count flash pages
within the first 256 of the 512 MMU entries on the ESP32-S3. Likewise,
the copy function will only copy flash pages which are mapped within
the first 256 entries (across two calls). As the esp-hal maps `.text`
and `.rodata` differently, these ROM functions are technically not
appropriate if more than 256 pages of flash (`.text` and `.rodata`
combined) are in use by the application.
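
To make the limitation concrete, here is a minimal host-side sketch. It
is neither the ROM code nor the code in this commit: it models the MMU
table as a plain slice, borrows the entry encoding from the constants
used in this commit (bit 14 marks an invalid entry, bit 15 selects
SPIRAM), and the names `ROM_WINDOW`, `count_flash_entries` and
`model_table` are hypothetical. The table layout built by `model_table`
is an assumption chosen purely for illustration.

```rust
/// Number of MMU entries on the ESP32-S3, as stated above.
const TABLE_SIZE: usize = 512;
/// Hypothetical name for the range of entries the ROM helpers can reach.
const ROM_WINDOW: usize = 256;
const ENTRY_INVALID: u32 = 1 << 14;
const ENTRY_ACCESS_SPIRAM: u32 = 1 << 15;

/// Count valid flash-mapped entries among the first `limit` entries.
fn count_flash_entries(table: &[u32], limit: usize) -> usize {
    table
        .iter()
        .take(limit)
        .filter(|&&entry| entry & (ENTRY_INVALID | ENTRY_ACCESS_SPIRAM) == 0)
        .count()
}

/// Build a model table with `.text` mapped from entry 0 and `.rodata`
/// mapped from entry `rodata_start` (an illustrative layout only).
fn model_table(text_pages: usize, rodata_start: usize, rodata_pages: usize) -> Vec<u32> {
    let mut table = vec![ENTRY_INVALID; TABLE_SIZE];
    for i in 0..text_pages {
        table[i] = i as u32; // flash page number, valid, flash-backed
    }
    for i in 0..rodata_pages {
        table[rodata_start + i] = (text_pages + i) as u32;
    }
    table
}
```

With `model_table(200, 200, 100)`, scanning all `TABLE_SIZE` entries
finds 300 flash pages, while scanning only the first `ROM_WINDOW`
entries finds 256, so a copy limited to that window would leave 44
pages behind in flash.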

Additionally, both functions contain bugs: one which the IDF attempts
to work around incorrectly, and another which the IDF does not appear
to be aware of. Details of these bugs can be found on the IDF issue/PR
tracker[0][1].

As a result, this commit contains a heavily modified Rust rewrite of
the reverse-engineered ROM code, combined with a loose port of the
ESP-IDF code.

There are three additional noteworthy differences from the ESP-IDF version
of the code:

1. The ESP-IDF allows the `.text` and `.rodata` segments to be mapped
   independently, so that only one of them can be moved to PSRAM. The
   current version of this code does not offer that flexibility. It
   could be implemented by checking the address of each page entry
   against the segment locations to determine which segment the entry
   belongs to.
2. The ESP-IDF calls
   `cache_ll_l1_enable_bus(..., cache_ll_l1_get_bus(..., SOC_EXTRAM_DATA_HIGH, 0));`
   (functions from the ESP-IDF) in order to "Enable the most high bus,
   which is used for copying FLASH `.text` to PSRAM". On the ESP32-S3,
   careful inspection shows that these calls are a no-op, as the address
   passed to `cache_ll_l1_get_bus` results in an empty cache bus mask.
   It's currently unclear to me whether this is a bug in the ESP-IDF
   code, or whether this code (which from cursory investigation is
   probably not a no-op on the -S2) is solely targeting the ESP32-S2.
3. The ESP-IDF calls `Cache_Flash_To_SPIRAM_Copy` with an icache address
   when copying `.text` and a dcache address when copying `.rodata`.
   This affects which cache the reads go through, but the writes
   always go through a "spare page" (a name I came up with during
   reverse engineering) via the dcache. This code performs all reads
   through the dcache. I don't know if there's a good reason to read
   through the matching cache when doing the copy, but reading through
   the dcache doesn't appear to have any negative impact.
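
The per-segment check suggested in item 1 could be sketched roughly as
follows. This is a host-side illustration under stated assumptions, not
part of this commit: `Segment`, `entry_range` and `classify_entry` are
hypothetical names, the segment bounds would have to come from linker
symbols that are not shown here, and the instruction bus base of
`0x4200_0000` is an assumption (only `DBUS_VADDR_BASE` appears in this
commit).

```rust
use core::ops::Range;

const PAGE_SIZE: u32 = 0x1_0000; // 64 KiB MMU page
const DBUS_VADDR_BASE: u32 = 0x3C00_0000;
const IBUS_VADDR_BASE: u32 = 0x4200_0000; // assumption, not from this commit

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Segment {
    Text,
    Rodata,
    Other,
}

/// Convert a virtual address range `[start, end)` on a bus starting at
/// `bus_base` into the range of MMU entry indices that cover it.
fn entry_range(bus_base: u32, start: u32, end: u32) -> Range<usize> {
    let first = (start - bus_base) / PAGE_SIZE;
    // Round up so a partially used final page is included.
    let last = (end - bus_base + PAGE_SIZE - 1) / PAGE_SIZE;
    first as usize..last as usize
}

/// Decide which segment the MMU entry at `index` belongs to.
fn classify_entry(index: usize, text: &Range<usize>, rodata: &Range<usize>) -> Segment {
    if text.contains(&index) {
        Segment::Text
    } else if rodata.contains(&index) {
        Segment::Rodata
    } else {
        Segment::Other
    }
}
```

A copy loop could then skip entries whose `Segment` the user did not
ask to move, which is the flexibility the ESP-IDF version offers.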

[0]: espressif/esp-idf#15262
[1]: espressif/esp-idf#15263
EliteTK committed Jan 25, 2025
1 parent e6e7a41 commit 98a5d15
Showing 3 changed files with 150 additions and 4 deletions.
1 change: 1 addition & 0 deletions esp-hal/CHANGELOG.md
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added
- SPI: Added support for 3-wire SPI (#2919)
- Add separate config for Rx and Tx (UART) #2965
- ESP32-S3: Support execute in place from PSRAM

### Changed

121 changes: 121 additions & 0 deletions esp-hal/src/soc/esp32s3/mmu.rs
@@ -5,7 +5,11 @@
const DBUS_VADDR_BASE: u32 = 0x3C000000;
const DR_REG_MMU_TABLE: u32 = 0x600C5000;
const ENTRY_ACCESS_FLASH: u32 = 0;
const ENTRY_INVALID: u32 = 1 << 14;
const ENTRY_TYPE: u32 = 1 << 15;
const ENTRY_VALID: u32 = 0;
const ENTRY_VALID_VAL_MASK: u32 = 0x3fff;
const ICACHE_MMU_SIZE: usize = 0x800;

pub(super) const ENTRY_ACCESS_SPIRAM: u32 = 1 << 15;
@@ -29,6 +33,10 @@ extern "C" {
        num: u32,
        fixed: u32,
    ) -> i32;

    fn Cache_Invalidate_Addr(addr: u32, size: u32);
    fn Cache_WriteBack_All();
    fn rom_Cache_WriteBack_Addr(addr: u32, size: u32);
}

#[procmacros::ram]
@@ -43,3 +51,116 @@ pub(super) fn last_mapped_index() -> Option<usize> {
pub(super) fn index_to_data_address(index: usize) -> u32 {
    DBUS_VADDR_BASE + (PAGE_SIZE * index) as u32
}

/// Count flash-mapped pages, de-duplicating mappings which refer to flash page
/// 0
#[procmacros::ram]
pub(super) fn count_effective_flash_pages() -> usize {
    let mmu_table_ptr = DR_REG_MMU_TABLE as *const u32;
    let mut page0_seen = false;
    let mut flash_pages = 0;
    for i in 0..(TABLE_SIZE - 1) {
        let mapping = unsafe { mmu_table_ptr.add(i).read_volatile() };
        if mapping & (ENTRY_INVALID | ENTRY_TYPE) == ENTRY_VALID | ENTRY_ACCESS_FLASH {
            if mapping & ENTRY_VALID_VAL_MASK == 0 {
                if page0_seen {
                    continue;
                }
                page0_seen = true;
            }
            flash_pages += 1;
        }
    }
    flash_pages
}

#[procmacros::ram]
unsafe fn move_flash_to_psram_with_spare(
    target_entry: usize,
    psram_page: usize,
    spare_entry: usize,
) {
    let mmu_table_ptr = DR_REG_MMU_TABLE as *mut u32;
    let target_entry_addr = DBUS_VADDR_BASE + (target_entry * PAGE_SIZE) as u32;
    let spare_entry_addr = DBUS_VADDR_BASE + (spare_entry * PAGE_SIZE) as u32;
    unsafe {
        mmu_table_ptr
            .add(spare_entry)
            .write_volatile(psram_page as u32 | ENTRY_ACCESS_SPIRAM);
        Cache_Invalidate_Addr(spare_entry_addr, PAGE_SIZE as u32);
        core::ptr::copy_nonoverlapping(
            target_entry_addr as *const u8,
            spare_entry_addr as *mut u8,
            PAGE_SIZE,
        );
        rom_Cache_WriteBack_Addr(spare_entry_addr, PAGE_SIZE as u32);
        mmu_table_ptr
            .add(target_entry)
            .write_volatile(psram_page as u32 | ENTRY_ACCESS_SPIRAM);
    }
}

/// Copy flash-mapped pages to PSRAM, copying flash-page 0 only once, and re-map
/// those pages to the PSRAM copies
#[procmacros::ram]
pub(super) unsafe fn copy_flash_to_psram_and_remap(free_page: usize) -> usize {
    let mmu_table_ptr = DR_REG_MMU_TABLE as *mut u32;

    const SPARE_PAGE: usize = TABLE_SIZE - 1;
    const SPARE_PAGE_DCACHE_ADDR: u32 = DBUS_VADDR_BASE + (SPARE_PAGE * PAGE_SIZE) as u32;

    let spare_page_mapping = unsafe { mmu_table_ptr.add(SPARE_PAGE).read_volatile() };
    let mut page0_page = None;
    let mut psram_page = free_page;

    unsafe { Cache_WriteBack_All() };
    for i in 0..(TABLE_SIZE - 1) {
        let mapping = unsafe { mmu_table_ptr.add(i).read_volatile() };
        if mapping & (ENTRY_INVALID | ENTRY_TYPE) != ENTRY_VALID | ENTRY_ACCESS_FLASH {
            continue;
        }
        if mapping & ENTRY_VALID_VAL_MASK == 0 {
            match page0_page {
                Some(page) => {
                    unsafe {
                        mmu_table_ptr
                            .add(i)
                            .write_volatile(page as u32 | ENTRY_ACCESS_SPIRAM)
                    };
                    continue;
                }
                None => page0_page = Some(psram_page),
            }
        }
        unsafe { move_flash_to_psram_with_spare(i, psram_page, SPARE_PAGE) };
        psram_page += 1;
    }

    // Restore spare page mapping
    unsafe {
        mmu_table_ptr
            .add(SPARE_PAGE)
            .write_volatile(spare_page_mapping);
        Cache_Invalidate_Addr(SPARE_PAGE_DCACHE_ADDR, PAGE_SIZE as u32);
    }

    // Special handling if the spare page was mapped to flash
    if spare_page_mapping & (ENTRY_INVALID | ENTRY_TYPE) == ENTRY_VALID | ENTRY_ACCESS_FLASH {
        unsafe {
            // We're running from ram so using the first page should not cause issues
            const SECOND_SPARE: usize = 0;
            let second_spare_mapping = mmu_table_ptr.add(SECOND_SPARE).read_volatile();

            move_flash_to_psram_with_spare(SPARE_PAGE, psram_page, SECOND_SPARE);

            // Restore second spare page mapping
            mmu_table_ptr
                .add(SECOND_SPARE)
                .write_volatile(second_spare_mapping);
            Cache_Invalidate_Addr(
                DBUS_VADDR_BASE + (SECOND_SPARE * PAGE_SIZE) as u32,
                PAGE_SIZE as u32,
            );
        }
        psram_page += 1;
    }
    psram_page - free_page
}
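
The flash-page-0 de-duplication in the loop above can be tried out on a
host with a simplified model. The sketch below is an illustration only:
it operates on a plain slice instead of the MMU registers, copies no
data, ignores the spare-page handling, and `is_flash` and `remap_model`
are hypothetical names. The entry encoding reuses the constants from
this file.

```rust
const ENTRY_INVALID: u32 = 1 << 14;
const ENTRY_ACCESS_SPIRAM: u32 = 1 << 15;
const ENTRY_VALID_VAL_MASK: u32 = 0x3fff;

/// True if the entry is valid and backed by flash.
fn is_flash(entry: u32) -> bool {
    entry & (ENTRY_INVALID | ENTRY_ACCESS_SPIRAM) == 0
}

/// Remap every flash entry to consecutive PSRAM pages starting at
/// `free_page`, sharing one PSRAM page between all entries that map
/// flash page 0. Returns the number of PSRAM pages consumed.
fn remap_model(table: &mut [u32], free_page: u32) -> u32 {
    let mut page0_page: Option<u32> = None;
    let mut psram_page = free_page;
    for entry in table.iter_mut() {
        if !is_flash(*entry) {
            continue;
        }
        if *entry & ENTRY_VALID_VAL_MASK == 0 {
            if let Some(page) = page0_page {
                // Flash page 0 was already copied: reuse that PSRAM page.
                *entry = page | ENTRY_ACCESS_SPIRAM;
                continue;
            }
            page0_page = Some(psram_page);
        }
        *entry = psram_page | ENTRY_ACCESS_SPIRAM;
        psram_page += 1;
    }
    psram_page - free_page
}
```

For a table holding flash pages `0, 1, <invalid>, 0, 2` followed by an
existing SPIRAM entry, `remap_model(&mut table, 4)` consumes three
PSRAM pages (4, 5 and 6), and both entries that mapped flash page 0 end
up pointing at PSRAM page 4.
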
32 changes: 28 additions & 4 deletions esp-hal/src/soc/esp32s3/psram.rs
@@ -108,6 +108,13 @@ pub struct PsramConfig {
    pub flash_frequency: FlashFreq,
    /// Frequency of PSRAM memory
    pub ram_frequency: SpiRamFreq,
    /// Copy code and read-only data from flash to PSRAM and remap the
    /// respective pages to point to PSRAM
    ///
    /// Refer to
    /// https://docs.espressif.com/projects/esp-idf/en/stable/esp32s3/api-guides/external-ram.html#execute-in-place-xip-from-psram
    /// for more information.
    pub execute_from_psram: bool,
}

/// Initialize PSRAM to be used for data.
Expand All @@ -124,7 +131,9 @@ pub(crate) fn init_psram(config: PsramConfig) {
const CONFIG_ESP32S3_DATA_CACHE_SIZE: u32 = 0x8000;
const CONFIG_ESP32S3_DCACHE_ASSOCIATED_WAYS: u8 = 8;
const CONFIG_ESP32S3_DATA_CACHE_LINE_SIZE: u8 = 32;
const START_PAGE: u32 = 0;

let mut free_page = 0;
let mut psram_size = config.size.get();

extern "C" {
fn rom_config_instruction_cache_mode(
@@ -144,8 +153,23 @@ pub(crate) fn init_psram(config: PsramConfig) {
        fn Cache_Resume_DCache(param: u32);
    }

    let start = unsafe {
    // Vaguely based off of the ESP-IDF equivalent code:
    // https://github.com/espressif/esp-idf/blob/3c99557eeea4e0945e77aabac672fbef52294d54/components/esp_psram/mmu_psram_flash.c#L46-L134
    if config.execute_from_psram {
        let flash_pages = mmu::count_effective_flash_pages();
        let psram_pages = psram_size / mmu::PAGE_SIZE;

        if flash_pages > psram_pages {
            panic!("Cannot execute from PSRAM: The number of PSRAM pages ({}) is too small to fit {} flash pages", psram_pages, flash_pages);
        }

        let psram_pages_used = unsafe { mmu::copy_flash_to_psram_and_remap(free_page) };

        free_page += psram_pages_used;
        psram_size -= psram_pages_used * mmu::PAGE_SIZE;
    }

    let start = unsafe {
        // calculate the PSRAM start address to map
        // the linker scripts can produce a gap between mapped IROM and DROM segments
        // bigger than a flash page - i.e. we will see an unmapped memory slot
@@ -177,9 +201,9 @@ pub(crate) fn init_psram(config: PsramConfig) {
        if mmu::cache_dbus_mmu_set(
            mmu::ENTRY_ACCESS_SPIRAM,
            start,
            START_PAGE << 16,
            (free_page as u32) << 16,
            64,
            config.size.get() as u32 / 1024 / 64, // number of pages to map
            (psram_size / mmu::PAGE_SIZE) as u32, // number of pages to map
            0,
        ) != 0
        {
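
The page accounting that `init_psram` performs when `execute_from_psram`
is set can be summarised with a small host-side sketch. It is a model
under stated assumptions rather than the esp-hal implementation:
`PsramLayout` and `plan_layout` are hypothetical names, the 64 KiB page
size is taken from the `64` passed to `cache_dbus_mmu_set`, the copy is
assumed to consume exactly `flash_pages` PSRAM pages, and an error is
returned where the real code panics.

```rust
const PAGE_SIZE: usize = 64 * 1024;

/// Where the data mapping of PSRAM starts and how many pages it spans.
#[derive(Debug, PartialEq, Eq)]
struct PsramLayout {
    first_data_page: usize,
    data_pages: usize,
}

/// Work out which PSRAM pages remain for data once `flash_pages` pages
/// have been reserved for the copied code and read-only data.
fn plan_layout(
    psram_size: usize,
    flash_pages: usize,
    execute_from_psram: bool,
) -> Result<PsramLayout, String> {
    let psram_pages = psram_size / PAGE_SIZE;
    if !execute_from_psram {
        // Nothing is copied: all of PSRAM is mapped as data from page 0.
        return Ok(PsramLayout { first_data_page: 0, data_pages: psram_pages });
    }
    if flash_pages > psram_pages {
        return Err(format!(
            "PSRAM has {} pages, too few to fit {} flash pages",
            psram_pages, flash_pages
        ));
    }
    // The copied pages occupy the start of PSRAM; data follows them.
    Ok(PsramLayout {
        first_data_page: flash_pages,
        data_pages: psram_pages - flash_pages,
    })
}
```

For an 8 MiB PSRAM (128 pages) and 20 flash pages, the data mapping
would start at PSRAM page 20 and cover the remaining 108 pages.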
