Changes from 8 commits
1 change: 1 addition & 0 deletions cmake/vars.cmake
@@ -103,6 +103,7 @@ set(HTTP_SERVER_HEADERS
http/server/HttpContext.h
http/server/HttpResponseWriter.h
http/server/WebSocketServer.h
http/server/FileCache.h
Owner

This header file doesn't need to be exposed publicly, does it?

)
Comment on lines 100 to 106

Copilot AI Mar 29, 2026

FileCacheEx.h is added to the CMake installed header list, but the repo also has a Makefile-based install flow (Makefile.vars) with its own HTTP_SERVER_HEADERS list. To keep both build systems consistent, please add http/server/FileCacheEx.h to Makefile.vars as well (otherwise make install won’t install the new public header).

Copilot uses AI. Check for mistakes.

set(MQTT_HEADERS
82 changes: 54 additions & 28 deletions http/server/FileCache.cpp
@@ -5,18 +5,21 @@
#include "htime.h"
#include "hlog.h"

#include "httpdef.h" // import http_content_type_str_by_suffix
#include "httpdef.h" // import http_content_type_str_by_suffix
#include "http_page.h" // import make_index_of_page

#ifdef OS_WIN
#include "hstring.h" // import hv::utf8_to_wchar
#include "hstring.h" // import hv::utf8_to_wchar
#endif

#define ETAG_FMT "\"%zx-%zx\""

FileCache::FileCache(size_t capacity) : hv::LRUCache<std::string, file_cache_ptr>(capacity) {
stat_interval = 10; // s
expired_time = 60; // s
FileCache::FileCache(size_t capacity)
: hv::LRUCache<std::string, file_cache_ptr>(capacity) {
stat_interval = 10; // s
expired_time = 60; // s
max_header_length = FILE_CACHE_DEFAULT_HEADER_LENGTH;
max_file_size = FILE_CACHE_DEFAULT_MAX_FILE_SIZE;
}

file_cache_ptr FileCache::Open(const char* filepath, OpenParam* param) {
@@ -26,6 +29,7 @@ file_cache_ptr FileCache::Open(const char* filepath, OpenParam* param) {
#endif
bool modified = false;
if (fc) {
std::lock_guard<std::mutex> lock(fc->mutex);
time_t now = time(NULL);
if (now - fc->stat_time > stat_interval) {
fc->stat_time = now;
@@ -53,19 +57,18 @@ file_cache_ptr FileCache::Open(const char* filepath, OpenParam* param) {
#endif
int fd = -1;
#ifdef OS_WIN
if(wfilepath.empty()) wfilepath = hv::utf8_to_wchar(filepath);
if(_wstat(wfilepath.c_str(), (struct _stat*)&st) != 0) {
if (wfilepath.empty()) wfilepath = hv::utf8_to_wchar(filepath);
if (_wstat(wfilepath.c_str(), (struct _stat*)&st) != 0) {
param->error = ERR_OPEN_FILE;
return NULL;
}
if(S_ISREG(st.st_mode)) {
if (S_ISREG(st.st_mode)) {
fd = _wopen(wfilepath.c_str(), flags);
}else if (S_ISDIR(st.st_mode)) {
// NOTE: open(dir) return -1 on windows
} else if (S_ISDIR(st.st_mode)) {
fd = 0;
}
#else
if(stat(filepath, &st) != 0) {
if (::stat(filepath, &st) != 0) {
param->error = ERR_OPEN_FILE;
return NULL;
}
@@ -75,62 +78,84 @@ file_cache_ptr FileCache::Open(const char* filepath, OpenParam* param) {
param->error = ERR_OPEN_FILE;
return NULL;
}
defer(if (fd > 0) { close(fd); })
#ifdef OS_WIN
defer(if (fd > 0) { close(fd); }) // fd=0 is Windows directory sentinel
#else

Copilot AI Apr 5, 2026

On Windows, using fd=0 as a directory sentinel combined with defer(if (fd > 0) close(fd);) can leak a real file descriptor if _wopen() happens to return 0 (possible if stdin is closed). Consider using a non-valid sentinel (e.g., -1/-2) for directories or tracking is_dir separately, and close any real fd with fd >= 0 when appropriate.
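
A minimal sketch of the "track is_dir separately" option, written with POSIX calls for brevity (a Windows branch would presumably use `_wstat`/`_wopen`/`_close` instead). `FdGuard`, `OpenResult`, and `open_path` are hypothetical names for illustration, not part of libhv. The point is that the directory case carries no descriptor at all, and the guard closes anything `>= 0`, including a real descriptor 0.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical RAII guard: closes any valid descriptor, including 0.
struct FdGuard {
    int fd = -1;
    FdGuard() = default;
    explicit FdGuard(int f) : fd(f) {}
    FdGuard(const FdGuard&) = delete;
    FdGuard& operator=(const FdGuard&) = delete;
    FdGuard(FdGuard&& other) noexcept : fd(other.fd) { other.fd = -1; }
    ~FdGuard() { if (fd >= 0) ::close(fd); }
};

// Hypothetical result type: "is a directory" is a flag, not an fd value.
struct OpenResult {
    bool ok = false;      // true if the path is a directory or an opened file
    bool is_dir = false;  // true for directories; fd stays -1 in that case
    FdGuard fd;           // owns the descriptor for regular files
};

inline OpenResult open_path(const char* path) {
    OpenResult r;
    struct stat st;
    if (::stat(path, &st) != 0) return r;   // missing or inaccessible
    if (S_ISDIR(st.st_mode)) {
        r.ok = true;
        r.is_dir = true;                     // no descriptor needed
        return r;
    }
    if (S_ISREG(st.st_mode)) {
        r.fd.fd = ::open(path, O_RDONLY);
        r.ok = (r.fd.fd >= 0);               // 0 is a valid descriptor here
    }
    return r;
}
```

With this shape, no sentinel value is needed and the `defer` could be the same on both platforms.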

defer(close(fd);)
#endif
if (fc == NULL) {
if (S_ISREG(st.st_mode) ||
(S_ISDIR(st.st_mode) &&
filepath[strlen(filepath)-1] == '/')) {
filepath[strlen(filepath) - 1] == '/')) {
fc = std::make_shared<file_cache_t>();
fc->filepath = filepath;
fc->st = st;
fc->header_reserve = max_header_length;
time(&fc->open_time);
fc->stat_time = fc->open_time;
fc->stat_cnt = 1;
put(filepath, fc);
}
else {
// NOTE: do NOT put() into cache yet — defer until fully initialized
} else {
Comment on lines 87 to +95

Copilot AI Apr 10, 2026

Deferring put() until after initialization avoids exposing partially-filled entries, but it also allows concurrent cache misses for the same filepath to start duplicate reads/allocations and then race to put() the final entry. Consider adding an in-flight load coordination mechanism (eg, insert a placeholder entry early plus a ‘loading’ state/condvar, or a per-key mutex) so only one thread loads a given file at a time while others wait or reuse the same entry.
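
The placeholder-plus-condvar idea could look roughly like the sketch below. `Entry`, `Slot`, and `SingleFlight` are hypothetical stand-ins (not libhv types), and the loader callback is an assumption standing in for the stat/read/initialize work in `Open()`. The first caller for a key becomes the leader and loads; concurrent callers for the same key wait on the slot and reuse the leader's result.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for file_cache_t.
struct Entry { std::string body; };
using EntryPtr = std::shared_ptr<Entry>;

// One slot per key that is currently being loaded.
struct Slot {
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
    EntryPtr value;
};

class SingleFlight {
public:
    // Runs `loader` once per key among concurrent callers; followers block
    // until the leader publishes its result (nullptr on failure).
    EntryPtr load(const std::string& key, const std::function<EntryPtr()>& loader) {
        std::shared_ptr<Slot> slot;
        bool leader = false;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto it = inflight_.find(key);
            if (it == inflight_.end()) {
                slot = std::make_shared<Slot>();
                inflight_[key] = slot;
                leader = true;
            } else {
                slot = it->second;
            }
        }
        if (!leader) {
            std::unique_lock<std::mutex> lock(slot->m);
            slot->cv.wait(lock, [&] { return slot->done; });
            return slot->value;
        }
        EntryPtr loaded;
        try { loaded = loader(); } catch (...) { loaded = nullptr; }
        // A real cache would put() the finished entry here, before the slot
        // is retired, so later callers hit the cache instead of reloading.
        {
            std::lock_guard<std::mutex> lock(mutex_);
            inflight_.erase(key);
        }
        {
            std::lock_guard<std::mutex> lock(slot->m);
            slot->value = loaded;
            slot->done = true;
        }
        slot->cv.notify_all();
        return loaded;
    }

private:
    std::mutex mutex_;  // protects inflight_ only
    std::map<std::string, std::shared_ptr<Slot>> inflight_;
};
```

The map mutex and the slot mutex are never held at the same time, so this coordination does not add a new lock-order dependency.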

param->error = ERR_MISMATCH;
return NULL;
}
}
// Hold fc->mutex for the remainder of initialization
std::lock_guard<std::mutex> lock(fc->mutex);
if (S_ISREG(fc->st.st_mode)) {
param->filesize = fc->st.st_size;
// FILE
if (param->need_read) {
if (fc->st.st_size > param->max_read) {
param->error = ERR_OVER_LIMIT;
// Don't cache incomplete entries
return NULL;
}
fc->resize_buf(fc->st.st_size);
int nread = read(fd, fc->filebuf.base, fc->filebuf.len);
if (nread != fc->filebuf.len) {
hloge("Failed to read file: %s", filepath);
param->error = ERR_READ_FILE;
return NULL;
fc->resize_buf(fc->st.st_size, max_header_length);
// Loop to handle partial reads (EINTR, etc.)
char* dst = fc->filebuf.base;
size_t remaining = fc->filebuf.len;
while (remaining > 0) {
ssize_t nread = read(fd, dst, remaining);
if (nread < 0) {
if (errno == EINTR) continue;
hloge("Failed to read file: %s", filepath);
param->error = ERR_READ_FILE;
return NULL;
}
if (nread == 0) {
hloge("Unexpected EOF reading file: %s", filepath);
param->error = ERR_READ_FILE;
return NULL;
}
Comment on lines +122 to +137

Copilot AI Apr 7, 2026

read(2) returns ssize_t, but the loop stores it in an int. If max_read is configured above INT_MAX (or on platforms where ssize_t is wider), this can truncate and break the loop logic. Use ssize_t for nread (and pass a size_t count that is capped to SSIZE_MAX if needed).
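
For reference, a standalone version of the loop with `ssize_t` throughout might look like this. `read_full` is a hypothetical helper name, and the sketch assumes `len` does not exceed `SSIZE_MAX`, so the total fits the return type.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: keeps calling read(2) until `len` bytes have arrived,
// EOF is reached, or a non-EINTR error occurs.
// Returns the number of bytes read (may be short on EOF), or -1 on error.
// Assumes len <= SSIZE_MAX.
inline ssize_t read_full(int fd, char* dst, size_t len) {
    size_t total = 0;
    while (total < len) {
        ssize_t n = ::read(fd, dst + total, len - total);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted: retry
            return -1;                      // real error
        }
        if (n == 0) break;                  // EOF before len bytes
        total += static_cast<size_t>(n);
    }
    return static_cast<ssize_t>(total);
}
```

A caller in `Open()` would then treat any return value other than the expected file size as `ERR_READ_FILE`.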

Owner

Sizes above INT_MAX are out of scope for now.

dst += nread;
remaining -= nread;
}
}
const char* suffix = strrchr(filepath, '.');
if (suffix) {
http_content_type content_type = http_content_type_enum_by_suffix(suffix+1);
http_content_type content_type = http_content_type_enum_by_suffix(suffix + 1);
if (content_type == TEXT_HTML) {
fc->content_type = "text/html; charset=utf-8";
} else if (content_type == TEXT_PLAIN) {
fc->content_type = "text/plain; charset=utf-8";
} else {
fc->content_type = http_content_type_str_by_suffix(suffix+1);
fc->content_type = http_content_type_str_by_suffix(suffix + 1);
}
}
}
else if (S_ISDIR(fc->st.st_mode)) {
} else if (S_ISDIR(fc->st.st_mode)) {
// DIR
std::string page;
make_index_of_page(filepath, page, param->path);
fc->resize_buf(page.size());
fc->resize_buf(page.size(), max_header_length);
memcpy(fc->filebuf.base, page.c_str(), page.size());
fc->content_type = "text/html; charset=utf-8";
}
gmtime_fmt(fc->st.st_mtime, fc->last_modified);
snprintf(fc->etag, sizeof(fc->etag), ETAG_FMT, (size_t)fc->st.st_mtime, (size_t)fc->st.st_size);
snprintf(fc->etag, sizeof(fc->etag), ETAG_FMT,
(size_t)fc->st.st_mtime, (size_t)fc->st.st_size);
// Cache the fully initialized entry
put(filepath, fc);

Copilot AI Apr 5, 2026

Potential deadlock: Open() holds fc->mutex for the remainder of initialization and then calls put(filepath, fc), which acquires the LRUCache internal mutex. Meanwhile RemoveExpiredFileCache() acquires the LRUCache mutex (via remove_if) and then locks fc->mutex. This lock-order inversion can deadlock under concurrent traffic. Avoid calling put/remove/contains/remove_if while holding fc->mutex, or change the expiration logic to avoid locking fc->mutex while LRUCache is locked (e.g., snapshot fields needed for eviction without taking fc->mutex, or collect keys to remove and erase them after releasing the cache lock).
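
One possible shape for the first option (never call `put()` while holding the entry lock) is sketched below. `ToyCache` is a hypothetical stand-in for `hv::LRUCache` that simply takes its own mutex on every operation, and `open_entry` stands in for the tail of `Open()`. The entry lock is scoped to initialization and released before the cache lock is taken.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for file_cache_t.
struct Entry {
    std::mutex mutex;
    std::string body;
};
using EntryPtr = std::shared_ptr<Entry>;

// Toy stand-in for hv::LRUCache: every operation takes the cache mutex.
class ToyCache {
public:
    void put(const std::string& key, EntryPtr e) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_[key] = std::move(e);
    }
    EntryPtr get(const std::string& key) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = map_.find(key);
        if (it == map_.end()) return nullptr;
        return it->second;
    }

private:
    std::mutex mutex_;
    std::map<std::string, EntryPtr> map_;
};

inline EntryPtr open_entry(ToyCache& cache, const std::string& key,
                           const std::string& content) {
    auto e = std::make_shared<Entry>();
    {
        std::lock_guard<std::mutex> lock(e->mutex);  // entry lock only
        e->body = content;                           // initialization work
    }                                                // entry lock released
    cache.put(key, e);  // cache lock only; the two are never held together
    return e;
}
```

Because no code path holds both mutexes at once, the order in which an expiry pass takes them no longer matters.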

}
return fc;
}
@@ -154,6 +179,7 @@ file_cache_ptr FileCache::Get(const char* filepath) {
void FileCache::RemoveExpiredFileCache() {
time_t now = time(NULL);
remove_if([this, now](const std::string& filepath, const file_cache_ptr& fc) {
std::lock_guard<std::mutex> lock(fc->mutex);

Copilot AI Apr 5, 2026

remove_if runs the predicate while holding the LRUCache internal mutex; locking fc->mutex inside the predicate introduces a lock-order inversion with Open() (which locks fc->mutex and later calls put()). This can deadlock. Consider avoiding per-entry locking inside remove_if predicates (e.g., store stat_time as an atomic, or do a two-phase approach: gather expired keys under cache lock without locking entries, then remove them).

Suggested change
std::lock_guard<std::mutex> lock(fc->mutex);
std::unique_lock<std::mutex> lock(fc->mutex, std::try_to_lock);
if (!lock.owns_lock()) {
return false;
}
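
As an alternative to `try_to_lock`, the atomic `stat_time` option mentioned above could be sketched as follows. `Entry` and `ToyCache` are hypothetical stand-ins, and `remove_expired` plays the role of `remove_if` with the expiry predicate. Only the cache mutex is taken, so there is no second lock to order.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <ctime>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for file_cache_t: stat_time is atomic, so the
// expiry check needs no per-entry mutex.
struct Entry {
    std::atomic<std::time_t> stat_time;
    explicit Entry(std::time_t t) : stat_time(t) {}
};
using EntryPtr = std::shared_ptr<Entry>;

class ToyCache {
public:
    void put(const std::string& key, EntryPtr e) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_[key] = std::move(e);
    }
    std::size_t size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return map_.size();
    }
    // Erases entries whose last stat is older than expired_time seconds.
    // Holds only the cache mutex; each entry is read via an atomic load.
    std::size_t remove_expired(std::time_t now, int expired_time) {
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t removed = 0;
        for (auto it = map_.begin(); it != map_.end();) {
            if (now - it->second->stat_time.load() > expired_time) {
                it = map_.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

private:
    std::mutex mutex_;
    std::map<std::string, EntryPtr> map_;
};
```

Unlike the `try_to_lock` variant, an entry that happens to be busy is still evaluated for expiry rather than skipped until the next pass.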

return (now - fc->stat_time > expired_time);
});
}
116 changes: 89 additions & 27 deletions http/server/FileCache.h
@@ -1,90 +1,152 @@
#ifndef HV_FILE_CACHE_H_
#define HV_FILE_CACHE_H_

/*
* FileCache — Enhanced File Cache for libhv HTTP server
*
* Features:
* 1. Configurable max_header_length (default 4096, tunable per-instance)
* 2. prepend_header() returns bool to report success/failure
* 3. Exposes header/buffer metrics via accessors
* 4. Fixes stat() name collision in is_modified()
* 5. max_cache_num / max_file_size configurable at runtime
* 6. Reserved header space can be tuned per-instance
* 7. Source-level API compatible; struct layout differs from original (no ABI/layout compatibility)
*/

#include <memory>
#include <map>
#include <string>
#include <mutex>

#include "hexport.h"
#include "hbuf.h"
#include "hstring.h"
#include "LRUCache.h"

#define HTTP_HEADER_MAX_LENGTH 1024 // 1K
#define FILE_CACHE_MAX_NUM 100
#define FILE_CACHE_MAX_SIZE (1 << 22) // 4M
// Default values — may be overridden at runtime via FileCache setters
#define FILE_CACHE_DEFAULT_HEADER_LENGTH 4096 // 4K
#define FILE_CACHE_DEFAULT_MAX_NUM 100
#define FILE_CACHE_DEFAULT_MAX_FILE_SIZE (1 << 22) // 4M

typedef struct file_cache_s {
mutable std::mutex mutex; // protects all mutable state below
std::string filepath;
struct stat st;
time_t open_time;
time_t stat_time;
uint32_t stat_cnt;
HBuf buf; // http_header + file_content
hbuf_t filebuf;
hbuf_t httpbuf;
HBuf buf; // header_reserve + file_content
hbuf_t filebuf; // points into buf: file content region
hbuf_t httpbuf; // points into buf: header + file content after prepend
char last_modified[64];
char etag[64];
std::string content_type;

// --- new: expose header metrics ---
int header_reserve; // reserved bytes before file content
int header_used; // actual bytes used by prepend_header

file_cache_s() {
stat_cnt = 0;
header_reserve = FILE_CACHE_DEFAULT_HEADER_LENGTH;
header_used = 0;
memset(last_modified, 0, sizeof(last_modified));
memset(etag, 0, sizeof(etag));
}

// NOTE: caller must hold mutex.
// On Windows, Open() uses _wstat() directly instead of calling this.
bool is_modified() {
time_t mtime = st.st_mtime;
stat(filepath.c_str(), &st);
::stat(filepath.c_str(), &st);
return mtime != st.st_mtime;
}

// NOTE: caller must hold mutex
bool is_complete() {
if(S_ISDIR(st.st_mode)) return filebuf.len > 0;
return filebuf.len == st.st_size;
if (S_ISDIR(st.st_mode)) return filebuf.len > 0;
return filebuf.len == (size_t)st.st_size;
}

void resize_buf(int filesize) {
buf.resize(HTTP_HEADER_MAX_LENGTH + filesize);
filebuf.base = buf.base + HTTP_HEADER_MAX_LENGTH;
// NOTE: caller must hold mutex — invalidates filebuf/httpbuf pointers
void resize_buf(size_t filesize, int reserved) {
if (reserved < 0) reserved = 0;
header_reserve = reserved;
buf.resize((size_t)reserved + filesize);
filebuf.base = buf.base + reserved;
filebuf.len = filesize;
// Invalidate httpbuf since buffer may have been reallocated
httpbuf.base = NULL;
httpbuf.len = 0;
header_used = 0;
}

void resize_buf(size_t filesize) {
resize_buf(filesize, header_reserve);
}

void prepend_header(const char* header, int len) {
if (len > HTTP_HEADER_MAX_LENGTH) return;
// Thread-safe: prepend header into reserved space.
// Returns true on success, false if header exceeds reserved space.
bool prepend_header(const char* header, int len) {
std::lock_guard<std::mutex> lock(mutex);
if (len <= 0 || len > header_reserve) return false;
httpbuf.base = filebuf.base - len;
httpbuf.len = len + filebuf.len;
httpbuf.len = (size_t)len + filebuf.len;
Comment on lines +82 to +91

Copilot AI Apr 5, 2026

file_cache_t::prepend_header now returns false when the header doesn’t fit, but it leaves httpbuf unchanged (and resize_buf() explicitly invalidates httpbuf). Callers that still read fc->httpbuf unconditionally can end up sending an empty/invalid response. Consider either (1) always setting httpbuf to a safe fallback (e.g., point at filebuf / clear and document) on failure, and/or (2) requiring/updating callers to check the return value and fall back to non-cached header+body sending.
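
Both options can be combined, as in the rough sketch below. `Buf`, `Entry`, and `make_chunks` are simplified hypothetical stand-ins (a `std::vector<char>` replaces `HBuf`, and locking is omitted). On failure `httpbuf` falls back to the body-only view, and the caller checks the return value and sends header and body as separate chunks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

struct Buf { char* base = nullptr; std::size_t len = 0; };

// Simplified stand-in for file_cache_t (no mutex, vector instead of HBuf).
struct Entry {
    std::vector<char> storage;  // reserved header space + file content
    std::size_t reserve = 0;
    Buf filebuf;
    Buf httpbuf;

    void resize_buf(std::size_t filesize, std::size_t reserved) {
        reserve = reserved;
        storage.assign(reserved + filesize, '\0');
        filebuf.base = storage.data() + reserved;
        filebuf.len = filesize;
        httpbuf = filebuf;  // safe default: body only
    }

    // Returns false and leaves httpbuf pointing at the body if the header
    // does not fit into the reserved space.
    bool prepend_header(const char* header, std::size_t len) {
        if (len == 0 || len > reserve) {
            httpbuf = filebuf;
            return false;
        }
        httpbuf.base = filebuf.base - len;
        httpbuf.len = len + filebuf.len;
        std::memcpy(httpbuf.base, header, len);
        return true;
    }
};

// Caller side: one contiguous chunk when the header fits, otherwise the
// header and the cached body as two chunks. Copies are for illustration.
inline std::vector<std::string> make_chunks(Entry& e, const std::string& header) {
    std::vector<std::string> chunks;
    if (e.prepend_header(header.data(), header.size())) {
        chunks.emplace_back(e.httpbuf.base, e.httpbuf.len);
    } else {
        chunks.push_back(header);
        chunks.emplace_back(e.filebuf.base, e.filebuf.len);
    }
    return chunks;
}
```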

memcpy(httpbuf.base, header, len);
header_used = len;
return true;
}
Comment on lines +79 to 95

Copilot AI Apr 7, 2026

prepend_header() writes into the shared reserved space of the cached entry and updates httpbuf to point at that region. Even with the mutex, callers read/use fc->httpbuf after the lock is released, so concurrent requests can observe httpbuf changing underneath them or have their header bytes overwritten. Consider redesigning to keep the cache entry immutable for serving (store only the body) and construct headers per request without modifying the shared entry.
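
To make the suggested redesign concrete, here is a rough sketch under the assumption that the cache stores only an immutable body and each request builds its own header string. `CachedBody`, `Response`, and `make_response` are hypothetical names, not existing libhv APIs.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical cache payload: never modified after it is published.
struct CachedBody {
    std::string content_type;
    std::string body;
};
using BodyPtr = std::shared_ptr<const CachedBody>;

// Per-request state: the header is owned by the request, the body is shared.
struct Response {
    std::string header;
    BodyPtr body;
};

inline Response make_response(const BodyPtr& fc, int status, const char* reason) {
    Response r;
    r.header = "HTTP/1.1 " + std::to_string(status) + " " + reason + "\r\n"
               "Content-Type: " + fc->content_type + "\r\n"
               "Content-Length: " + std::to_string(fc->body.size()) + "\r\n\r\n";
    r.body = fc;  // shared read-only; no writes into the cache entry
    return r;
}
```

The cost is sending header and body as two buffers (or one gathered write) instead of a single contiguous buffer, which is the trade-off the reserved header space was designed to avoid.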

Owner

Not addressing this for now.


// --- thread-safe accessors ---
int get_header_reserve() const { std::lock_guard<std::mutex> lock(mutex); return header_reserve; }
int get_header_used() const { std::lock_guard<std::mutex> lock(mutex); return header_used; }
int get_header_remaining() const { std::lock_guard<std::mutex> lock(mutex); return header_reserve - header_used; }
bool header_fits(int len) const { std::lock_guard<std::mutex> lock(mutex); return len > 0 && len <= header_reserve; }
} file_cache_t;

typedef std::shared_ptr<file_cache_t> file_cache_ptr;
typedef std::shared_ptr<file_cache_t> file_cache_ptr;

class FileCache : public hv::LRUCache<std::string, file_cache_ptr> {
class HV_EXPORT FileCache : public hv::LRUCache<std::string, file_cache_ptr> {
public:
int stat_interval;
int expired_time;
// --- configurable parameters (were hardcoded macros before) ---
int stat_interval; // seconds between stat() checks
int expired_time; // seconds before cache entry expires
int max_header_length; // reserved header bytes per entry
int max_file_size; // max cached file size (larger = large-file path)

FileCache(size_t capacity = FILE_CACHE_MAX_NUM);
explicit FileCache(size_t capacity = FILE_CACHE_DEFAULT_MAX_NUM);

struct OpenParam {
bool need_read;
int max_read;
const char* path;
size_t filesize;
int error;
bool need_read;
int max_read; // per-request override for max file size
const char* path; // URL path (for directory listing)
size_t filesize; // [out] actual file size
int error; // [out] error code if Open returns NULL

Copilot AI Apr 5, 2026

max_file_size / OpenParam::max_read are declared as int, but they represent a byte size and are compared against st.st_size (typically off_t, potentially >2GB). Using int can overflow/truncate on large files and makes it harder to configure sizes beyond INT_MAX. Consider switching these fields and related APIs to size_t (or uint64_t) so large-file thresholds work correctly on 64-bit platforms.
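
A small sketch of the limit check with 64-bit types. `over_limit` is a hypothetical helper, and `std::int64_t` stands in for `off_t` on the assumption of a platform with 64-bit file offsets.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if the file must not be cached.
// file_size mirrors st.st_size (signed); max_read is an unsigned byte count.
inline bool over_limit(std::int64_t file_size, std::uint64_t max_read) {
    if (file_size < 0) return true;  // defensive: treat bogus sizes as too large
    return static_cast<std::uint64_t>(file_size) > max_read;
}
```

With a 32-bit `int` limit, a threshold such as 4 GiB cannot be expressed at all, which is the configuration problem described above.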


OpenParam() {
need_read = true;
max_read = FILE_CACHE_MAX_SIZE;
max_read = FILE_CACHE_DEFAULT_MAX_FILE_SIZE;
path = "/";
Comment on lines 122 to 125

Copilot AI Apr 10, 2026

OpenParam::max_read defaults to FILE_CACHE_DEFAULT_MAX_FILE_SIZE, so calling FileCache::SetMaxFileSize() does not affect Open() for callers that rely on the default OpenParam (they’ll still use 4MB unless they manually override max_read). Consider making Open() apply max_file_size as the default/upper bound when max_read is unset, or initialize OpenParam::max_read from the owning FileCache’s max_file_size to keep the API behavior consistent.

filesize = 0;
error = 0;
}
Comment on lines 115 to 128

Copilot AI Apr 5, 2026

OpenParam::max_read defaults to FILE_CACHE_DEFAULT_MAX_FILE_SIZE, while FileCache now has a runtime-configurable max_file_size (and HttpServer sets it from service->max_file_cache_size). Callers that don't explicitly set param.max_read will silently ignore the instance’s configured max_file_size, which can lead to inconsistent caching behavior across call sites. Consider defaulting OpenParam::max_read from the owning FileCache instance (e.g., in Open() when param->max_read is unset/0), or remove one of these knobs to keep a single source of truth.
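
A sketch of the "single source of truth" idea, assuming that `max_read == 0` is reserved to mean "not set by the caller" (the PR does not define such a convention). `CacheLimits` and `effective_max_read` are hypothetical names. The instance limit serves as both the default and the upper bound.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

struct OpenParam {
    std::size_t max_read = 0;  // assumption: 0 means "use the cache's limit"
};

struct CacheLimits {
    std::size_t max_file_size = std::size_t(1) << 22;  // 4M default

    // Per-request override, bounded by the instance-wide limit.
    std::size_t effective_max_read(const OpenParam& p) const {
        if (p.max_read == 0) return max_file_size;
        return std::min(p.max_read, max_file_size);
    }
};
```

`Open()` would then compare `st.st_size` against `effective_max_read(*param)` instead of `param->max_read` directly.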

};

file_cache_ptr Open(const char* filepath, OpenParam* param);
bool Exists(const char* filepath) const;
bool Close(const char* filepath);
void RemoveExpiredFileCache();

// --- new: getters ---
int GetMaxHeaderLength() const { return max_header_length; }
int GetMaxFileSize() const { return max_file_size; }
int GetStatInterval() const { return stat_interval; }
int GetExpiredTime() const { return expired_time; }

// --- new: setters ---
void SetMaxHeaderLength(int len) { max_header_length = len; }
void SetMaxFileSize(int size) { max_file_size = size; }

Copilot AI Apr 5, 2026

SetMaxHeaderLength / SetMaxFileSize accept negative values, which can lead to surprising behavior (e.g., reserved header space clamped to 0 at resize time, or max_file_size < 0). It would be safer to validate/clamp in the setters (e.g., minimum 0/1) so misconfiguration fails fast and predictably.

Suggested change
void SetMaxHeaderLength(int len) { max_header_length = len; }
void SetMaxFileSize(int size) { max_file_size = size; }
void SetMaxHeaderLength(int len) { max_header_length = len < 0 ? 0 : len; }
void SetMaxFileSize(int size) { max_file_size = size < 1 ? 1 : size; }


protected:
file_cache_ptr Get(const char* filepath);
};
4 changes: 2 additions & 2 deletions http/server/HttpHandler.cpp
@@ -842,8 +842,8 @@ int HttpHandler::GetSendData(char** data, size_t* len) {
}
case SEND_DONE:
{
// NOTE: remove file cache if > FILE_CACHE_MAX_SIZE
if (fc && fc->filebuf.len > FILE_CACHE_MAX_SIZE) {
// NOTE: remove file cache if > max_file_size
if (fc && fc->filebuf.len > files->GetMaxFileSize()) {

Copilot AI Apr 10, 2026

This compares size_t (fc->filebuf.len) with int (GetMaxFileSize()). To avoid signed/unsigned warnings and edge cases when values exceed INT_MAX, consider making GetMaxFileSize() return size_t (or cast the return value to size_t at the call site).

Suggested change
if (fc && fc->filebuf.len > files->GetMaxFileSize()) {
if (fc && fc->filebuf.len > static_cast<size_t>(files->GetMaxFileSize())) {
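
One caveat with a bare cast: a negative `int` converts to a very large `size_t`, so a misconfigured negative limit would never trigger eviction. A hypothetical helper that makes the negative case explicit might look like this. Treating a negative limit as "nothing may stay cached" is an assumption, not existing libhv behavior.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: compares a buffer length against an int limit
// without relying on implicit signed-to-unsigned conversion.
inline bool exceeds_limit(std::size_t len, int limit) {
    if (limit < 0) return true;  // assumption: negative limit allows nothing
    return len > static_cast<std::size_t>(limit);
}
```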

files->Close(fc->filepath.c_str());
}
Comment on lines +849 to 852

Copilot AI Apr 5, 2026

files->GetMaxFileSize() is used here to decide whether to evict the cached entry, but FileCache::max_file_size is never wired up to HttpService::max_file_cache_size (which controls OpenParam.max_read). If max_file_cache_size is increased, entries may still be evicted immediately due to the default 4MB FileCache::max_file_size. Consider initializing FileCache::max_file_size from service->max_file_cache_size when the server starts (same place stat_interval/expired_time are configured) so caching behavior is consistent.

Suggested change
// NOTE: remove file cache if > max_file_size
if (fc && fc->filebuf.len > files->GetMaxFileSize()) {
files->Close(fc->filepath.c_str());
}
// Avoid immediately evicting the just-served cached file based on a
// potentially stale FileCache max_file_size setting. Cache size policy
// should be enforced where the cache is configured/populated so it stays
// consistent with the service-level max_file_cache_size.

fc = NULL;